Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
Matias Elo(matiaselo) replied on github web page:

platform/linux-generic/libodp-linux.pc.in
line 7
@@ -7,5 +7,5 @@ Name: libodp-linux
 Description: The ODP packet processing engine
 Version: @PKGCONFIG_VERSION@
 Libs: -L${libdir} -lodp-linux
-Libs.private: @OPENSSL_STATIC_LIBS@ @DPDK_LIBS@ @PCAP_LIBS@ @PTHREAD_LIBS@ @TIMER_LIBS@ -lpthread @ATOMIC_LIBS@
-Cflags: -I${includedir}
+Libs.private: @OPENSSL_STATIC_LIBS@ @LIBCONFIG_STATIC_LIBS@ @DPDK_LIBS@ @PCAP_LIBS@ @PTHREAD_LIBS@ @TIMER_LIBS@ -lpthread @ATOMIC_LIBS@
+Cflags: -I${includedir} @LIBCONFIG_CPPFLAGS@

Comment:
OK, will fix.

> Matias Elo(matiaselo) wrote:
> I'm usually using multiple different NICs, so I find this information useful
> for logging test runs. Perhaps use ODP_PRINT() here?

>> Matias Elo(matiaselo) wrote:
>> Will fix.

>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>> Just use `PKG_CHECK_MODULES` here rather than inventing variables and
>>> checks.

>>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>>> Ideally we do not need this field. We should always return valid setting.
>>>> Maybe by allowing ODP modules to return default value.

>>>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>>>> I'd prefer not to export config internals here, but rather have
>>>>> `_odp_config_lookup_int()`, `_odp_config_lookup_string()`, etc.

>>>>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>>>>> Last line should not be necessary

>>>>>>> bogdanPricope wrote
>>>>>>> OFP is using a similar naming for the environment variable but with
>>>>>>> 'CONF' instead of 'CONFIG' ('OFP_CONF_FILE'). Will it makes sense to
>>>>>>> change the naming to avoid confusions: either 'ODP_CONF_FILE' in ODP or
>>>>>>> 'OFP_CONFIG_FILE' in OFP?

>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>> Do we really want to print this unconditionally? In any event shouldn't
>>>>>>>> this be `ODP_LOG()` here rather than `printf()`?

>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>> According to the [libconfig
>>>>>>>>> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
>>>>>>>>> there are versions pre-1.0 (e.g., 0.9) which would fail this test.
>>>>>>>>> This needs to be reversed so that you use the newer form for v1.5 and
>>>>>>>>> higher levels:
>>>>>>>>> ```
>>>>>>>>> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
>>>>>>>>> LIBCONFIG_VER_MINOR >= 5)) ...
>>>>>>>>> ```

>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>> Does this have to be a hard dependency? Can we have this feature be
>>>>>>>>>> omitted (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170558253
updated_at 2018-02-26 11:06:52

Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
Matias Elo(matiaselo) replied on github web page:

platform/linux-generic/pktio/dpdk.c
line 51
@@ -93,6 +95,56 @@ void refer_constructors(void)
 }
 #endif
+static void lookup_opt(config_setting_t *default_opt, config_setting_t *drv_opt,
+		       const char *opt, int *val)
+{
+	/* Default option */
+	config_setting_lookup_int(default_opt, opt, val);
+
+	/* Driver option overwrites default option */
+	if (drv_opt)
+		config_setting_lookup_int(drv_opt, opt, val);
+}
+
+static void init_options(pktio_entry_t *pktio_entry,
+			 const struct rte_eth_dev_info *dev_info)
+{
+	dpdk_opt_t *opt = &pktio_entry->s.pkt_dpdk.opt;
+	config_setting_t *default_opt;
+	config_setting_t *drv_opt;
+
+	/* Default values. Update 'config/odp-linux.conf' if modified. */
+	opt->num_rx_desc = DPDK_NM_RX_DESC;
+	opt->num_tx_desc = DPDK_NM_TX_DESC;
+	opt->rx_drop_en = 0;
+
+	default_opt = _odp_libconfig_lookup("pktio_dpdk");
+	if (!default_opt) {
+		ODP_DBG("No DPDK pktio options found\n");
+		goto done;
+	}
+
+	/* config_lookup_from() was renamed to config_setting_lookup() in
+	 * libconfig 1.5.0 */
+#if (LIBCONFIG_VER_MAJOR <= 1 && LIBCONFIG_VER_MINOR < 5)

Comment:
Will fix.

> Dmitry Eremin-Solenikov(lumag) wrote:
> Just use `PKG_CHECK_MODULES` here rather than inventing variables and checks.

>> Dmitry Eremin-Solenikov(lumag) wrote:
>> Ideally we do not need this field. We should always return valid setting.
>> Maybe by allowing ODP modules to return default value.

>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>> I'd prefer not to export config internals here, but rather have
>>> `_odp_config_lookup_int()`, `_odp_config_lookup_string()`, etc.

>>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>>> Last line should not be necessary

>>>>> bogdanPricope wrote
>>>>> OFP is using a similar naming for the environment variable but with
>>>>> 'CONF' instead of 'CONFIG' ('OFP_CONF_FILE'). Will it makes sense to
>>>>> change the naming to avoid confusions: either 'ODP_CONF_FILE' in ODP or
>>>>> 'OFP_CONFIG_FILE' in OFP?

>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>> Do we really want to print this unconditionally? In any event shouldn't
>>>>>> this be `ODP_LOG()` here rather than `printf()`?

>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>> According to the [libconfig
>>>>>>> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
>>>>>>> there are versions pre-1.0 (e.g., 0.9) which would fail this test.
>>>>>>> This needs to be reversed so that you use the newer form for v1.5 and
>>>>>>> higher levels:
>>>>>>> ```
>>>>>>> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
>>>>>>> LIBCONFIG_VER_MINOR >= 5)) ...
>>>>>>> ```

>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>> Does this have to be a hard dependency? Can we have this feature be
>>>>>>>> omitted (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170554301
updated_at 2018-02-26 10:50:27

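
The version test discussed in this message can be checked without libconfig installed. The following is only a sketch: the helper names `patch_selects_old_api()` and `review_selects_new_api()` are hypothetical, and they take the version numbers as arguments, whereas the real code tests the `LIBCONFIG_VER_MAJOR` and `LIBCONFIG_VER_MINOR` macros in an `#if`.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration only; not part of ODP or
 * libconfig. They evaluate the two preprocessor conditions at run time. */

/* Condition from the patch: non-zero selects the old name,
 * config_lookup_from(). */
int patch_selects_old_api(int major, int minor)
{
	return major <= 1 && minor < 5;
}

/* Condition proposed in the review: non-zero selects the new name,
 * config_setting_lookup(). */
int review_selects_new_api(int major, int minor)
{
	return major > 1 || (major == 1 && minor >= 5);
}

/* Print which name each condition would pick for a given version. */
void show_choice(int major, int minor)
{
	printf("%d.%d: patch picks %s name, review picks %s name\n",
	       major, minor,
	       patch_selects_old_api(major, minor) ? "old" : "new",
	       review_selects_new_api(major, minor) ? "new" : "old");
}
```

For a pre-1.0 version such as 0.9, the patch condition is false because the minor number is not below 5, so the `#else` branch with the new name would be compiled. The reversed condition from the review is also false for 0.9, which leaves the old name in use there. Both conditions agree for 1.4, 1.5 and 2.0.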
Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
Matias Elo(matiaselo) replied on github web page:

platform/linux-generic/pktio/dpdk.c
line 67
@@ -93,6 +95,56 @@ void refer_constructors(void)
 }
 #endif
+static void lookup_opt(config_setting_t *default_opt, config_setting_t *drv_opt,
+		       const char *opt, int *val)
+{
+	/* Default option */
+	config_setting_lookup_int(default_opt, opt, val);
+
+	/* Driver option overwrites default option */
+	if (drv_opt)
+		config_setting_lookup_int(drv_opt, opt, val);
+}
+
+static void init_options(pktio_entry_t *pktio_entry,
+			 const struct rte_eth_dev_info *dev_info)
+{
+	dpdk_opt_t *opt = &pktio_entry->s.pkt_dpdk.opt;
+	config_setting_t *default_opt;
+	config_setting_t *drv_opt;
+
+	/* Default values. Update 'config/odp-linux.conf' if modified. */
+	opt->num_rx_desc = DPDK_NM_RX_DESC;
+	opt->num_tx_desc = DPDK_NM_TX_DESC;
+	opt->rx_drop_en = 0;
+
+	default_opt = _odp_libconfig_lookup("pktio_dpdk");
+	if (!default_opt) {
+		ODP_DBG("No DPDK pktio options found\n");
+		goto done;
+	}
+
+	/* config_lookup_from() was renamed to config_setting_lookup() in
+	 * libconfig 1.5.0 */
+#if (LIBCONFIG_VER_MAJOR <= 1 && LIBCONFIG_VER_MINOR < 5)
+	drv_opt = config_lookup_from(default_opt, dev_info->driver_name);
+#else
+	drv_opt = config_setting_lookup(default_opt, dev_info->driver_name);
+#endif
+
+	/* Read options from config file */
+	lookup_opt(default_opt, drv_opt, "num_rx_desc", &opt->num_rx_desc);
+	lookup_opt(default_opt, drv_opt, "num_tx_desc", &opt->num_tx_desc);
+	lookup_opt(default_opt, drv_opt, "rx_drop_en", &opt->rx_drop_en);
+
+done:
+	printf("DPDK interface (%s): %" PRIu16 "\n", dev_info->driver_name,
+	       pktio_entry->s.pkt_dpdk.port_id);
+	printf(" num_rx_desc: %d\n", opt->num_rx_desc);
+	printf(" num_tx_desc: %d\n", opt->num_tx_desc);
+	printf(" rx_drop_en: %d\n", opt->rx_drop_en);

Comment:
I'm usually using multiple different NICs, so I find this information useful
for logging test runs. Perhaps use ODP_PRINT() here?

> Matias Elo(matiaselo) wrote:
> Will fix.

>> Dmitry Eremin-Solenikov(lumag) wrote:
>> Just use `PKG_CHECK_MODULES` here rather than inventing variables and checks.

>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>> Ideally we do not need this field. We should always return valid setting.
>>> Maybe by allowing ODP modules to return default value.

>>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>>> I'd prefer not to export config internals here, but rather have
>>>> `_odp_config_lookup_int()`, `_odp_config_lookup_string()`, etc.

>>>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>>>> Last line should not be necessary

>>>>>> bogdanPricope wrote
>>>>>> OFP is using a similar naming for the environment variable but with
>>>>>> 'CONF' instead of 'CONFIG' ('OFP_CONF_FILE'). Will it makes sense to
>>>>>> change the naming to avoid confusions: either 'ODP_CONF_FILE' in ODP or
>>>>>> 'OFP_CONFIG_FILE' in OFP?

>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>> Do we really want to print this unconditionally? In any event shouldn't
>>>>>>> this be `ODP_LOG()` here rather than `printf()`?

>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>> According to the [libconfig
>>>>>>>> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
>>>>>>>> there are versions pre-1.0 (e.g., 0.9) which would fail this test.
>>>>>>>> This needs to be reversed so that you use the newer form for v1.5 and
>>>>>>>> higher levels:
>>>>>>>> ```
>>>>>>>> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
>>>>>>>> LIBCONFIG_VER_MINOR >= 5)) ...
>>>>>>>> ```

>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>> Does this have to be a hard dependency? Can we have this feature be
>>>>>>>>> omitted (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170558103
updated_at 2018-02-26 11:06:13

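
The layering in the quoted `lookup_opt()` (built-in value, then the `pktio_dpdk` defaults, then the per-driver group) can be sketched without libconfig. Everything below is an assumption made for illustration: `struct setting`, `struct group` and `group_lookup_int()` are hypothetical stand-ins for `config_setting_t` and `config_setting_lookup_int()`, and the sketch assumes, as the patch does, that a failed lookup leaves `*val` untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a libconfig group: a flat list of integer
 * settings. Not part of ODP or libconfig. */
struct setting {
	const char *name;
	int value;
};

struct group {
	const struct setting *items;
	size_t count;
};

/* Writes *val and returns 1 only when the name is present; otherwise
 * leaves *val untouched and returns 0. */
int group_lookup_int(const struct group *grp, const char *name, int *val)
{
	size_t i;

	for (i = 0; i < grp->count; i++) {
		if (strcmp(grp->items[i].name, name) == 0) {
			*val = grp->items[i].value;
			return 1;
		}
	}
	return 0;
}

/* Same layering as lookup_opt() in the patch: *val starts out holding the
 * built-in value, the default group may replace it, and the driver group
 * (if one exists) may replace it again. */
void lookup_opt(const struct group *default_opt, const struct group *drv_opt,
		const char *name, int *val)
{
	group_lookup_int(default_opt, name, val);

	if (drv_opt)
		group_lookup_int(drv_opt, name, val);
}
```

With this ordering a driver group only needs to list the options it changes; anything it omits falls back to the default group, and anything missing from both keeps the value the caller initialized.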
Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
Dmitry Eremin-Solenikov(lumag) replied on github web page:

m4/odp_libconfig.m4
line 7
@@ -0,0 +1,60 @@
+# ODP_LIBCONFIG([ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
+# -
+AC_DEFUN([ODP_LIBCONFIG],
+[dnl
+AC_ARG_VAR([LIBCONFIG_CPPFLAGS], [C preprocessor flags for libconfig])
+AC_ARG_VAR([LIBCONFIG_LIBS], [linker flags for libconfig library])
+AC_ARG_VAR([LIBCONFIG_STATIC_LIBS], [static linker flags for libconfig library])

Comment:
Just use `PKG_CHECK_MODULES` here rather than inventing variables and checks.

> Dmitry Eremin-Solenikov(lumag) wrote:
> Ideally we do not need this field. We should always return valid setting.
> Maybe by allowing ODP modules to return default value.

>> Dmitry Eremin-Solenikov(lumag) wrote:
>> I'd prefer not to export config internals here, but rather have
>> `_odp_config_lookup_int()`, `_odp_config_lookup_string()`, etc.

>>> Dmitry Eremin-Solenikov(lumag) wrote:
>>> Last line should not be necessary

>>>> bogdanPricope wrote
>>>> OFP is using a similar naming for the environment variable but with
>>>> 'CONF' instead of 'CONFIG' ('OFP_CONF_FILE'). Will it makes sense to
>>>> change the naming to avoid confusions: either 'ODP_CONF_FILE' in ODP or
>>>> 'OFP_CONFIG_FILE' in OFP?

>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>> Do we really want to print this unconditionally? In any event shouldn't
>>>>> this be `ODP_LOG()` here rather than `printf()`?

>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>> According to the [libconfig
>>>>>> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
>>>>>> there are versions pre-1.0 (e.g., 0.9) which would fail this test. This
>>>>>> needs to be reversed so that you use the newer form for v1.5 and higher
>>>>>> levels:
>>>>>> ```
>>>>>> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
>>>>>> LIBCONFIG_VER_MINOR >= 5)) ...
>>>>>> ```

>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>> Does this have to be a hard dependency? Can we have this feature be
>>>>>>> omitted (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170518543
updated_at 2018-02-26 08:22:22

Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
Dmitry Eremin-Solenikov(lumag) replied on github web page:

platform/linux-generic/include/odp_internal.h
line 13
@@ -55,10 +56,13 @@ struct odp_global_data_s {
 	odp_cpumask_t control_cpus;
 	odp_cpumask_t worker_cpus;
 	int num_cpus_installed;
+	config_t libconfig; /*< Runtime config using libconfig */
+	uint8_t libconfig_enabled; /*< Runtime config enabled */

Comment:
Ideally we do not need this field. We should always return valid setting.
Maybe by allowing ODP modules to return default value.

> Dmitry Eremin-Solenikov(lumag) wrote:
> I'd prefer not to export config internals here, but rather have
> `_odp_config_lookup_int()`, `_odp_config_lookup_string()`, etc.

>> Dmitry Eremin-Solenikov(lumag) wrote:
>> Last line should not be necessary

>>> bogdanPricope wrote
>>> OFP is using a similar naming for the environment variable but with 'CONF'
>>> instead of 'CONFIG' ('OFP_CONF_FILE'). Will it makes sense to change the
>>> naming to avoid confusions: either 'ODP_CONF_FILE' in ODP or
>>> 'OFP_CONFIG_FILE' in OFP?

>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>> Do we really want to print this unconditionally? In any event shouldn't
>>>> this be `ODP_LOG()` here rather than `printf()`?

>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>> According to the [libconfig
>>>>> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
>>>>> there are versions pre-1.0 (e.g., 0.9) which would fail this test. This
>>>>> needs to be reversed so that you use the newer form for v1.5 and higher
>>>>> levels:
>>>>> ```
>>>>> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
>>>>> LIBCONFIG_VER_MINOR >= 5)) ...
>>>>> ```

>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>> Does this have to be a hard dependency? Can we have this feature be
>>>>>> omitted (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170518104
updated_at 2018-02-26 08:22:22

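
One way to read the two suggestions in this message (no `libconfig_enabled` flag, and lookup helpers instead of an exported `config_t`) is a lookup that takes the caller's default and always returns a usable value. The sketch below is not the ODP implementation: `struct entry`, `cfg_set_loaded()` and `cfg_lookup_int()` are hypothetical names, and a small in-memory table stands in for the parsed configuration file so the example needs no libconfig.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parsed config file. A real implementation
 * would keep a libconfig config_t here, private to this file. */
struct entry {
	const char *path;
	int value;
};

static const struct entry *loaded; /* NULL when no file was loaded */
static size_t num_loaded;

/* Install (or clear, with NULL) the loaded settings. */
void cfg_set_loaded(const struct entry *entries, size_t num)
{
	loaded = entries;
	num_loaded = num;
}

/* Always returns a valid setting: the configured value when the path is
 * present, otherwise the caller's default. Callers never need to ask
 * whether runtime configuration is enabled. */
int cfg_lookup_int(const char *path, int def)
{
	size_t i;

	for (i = 0; loaded != NULL && i < num_loaded; i++) {
		if (strcmp(loaded[i].path, path) == 0)
			return loaded[i].value;
	}
	return def;
}
```

A module would then call something like `cfg_lookup_int("pktio_dpdk.num_rx_desc", 128)` (path and value chosen only for illustration) and behave the same whether or not a configuration file was loaded.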
Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
Dmitry Eremin-Solenikov(lumag) replied on github web page:

platform/linux-generic/libodp-linux.pc.in
line 7
@@ -7,5 +7,5 @@ Name: libodp-linux
 Description: The ODP packet processing engine
 Version: @PKGCONFIG_VERSION@
 Libs: -L${libdir} -lodp-linux
-Libs.private: @OPENSSL_STATIC_LIBS@ @DPDK_LIBS@ @PCAP_LIBS@ @PTHREAD_LIBS@ @TIMER_LIBS@ -lpthread @ATOMIC_LIBS@
-Cflags: -I${includedir}
+Libs.private: @OPENSSL_STATIC_LIBS@ @LIBCONFIG_STATIC_LIBS@ @DPDK_LIBS@ @PCAP_LIBS@ @PTHREAD_LIBS@ @TIMER_LIBS@ -lpthread @ATOMIC_LIBS@
+Cflags: -I${includedir} @LIBCONFIG_CPPFLAGS@

Comment:
Last line should not be necessary

> bogdanPricope wrote
> OFP is using a similar naming for the environment variable but with 'CONF'
> instead of 'CONFIG' ('OFP_CONF_FILE'). Will it makes sense to change the
> naming to avoid confusions: either 'ODP_CONF_FILE' in ODP or
> 'OFP_CONFIG_FILE' in OFP?

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Do we really want to print this unconditionally? In any event shouldn't this
>> be `ODP_LOG()` here rather than `printf()`?

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> According to the [libconfig
>>> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
>>> there are versions pre-1.0 (e.g., 0.9) which would fail this test. This
>>> needs to be reversed so that you use the newer form for v1.5 and higher
>>> levels:
>>> ```
>>> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
>>> LIBCONFIG_VER_MINOR >= 5)) ...
>>> ```

>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>> Does this have to be a hard dependency? Can we have this feature be
>>>> omitted (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170516985
updated_at 2018-02-26 08:22:22

Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
bogdanPricope replied on github web page:

.travis.yml
line 13
@@ -254,7 +255,7 @@ script:
 - make -j $(nproc)
 - mkdir /dev/shm/odp
 - if [ -z "$CROSS_ARCH" ] ; then
-    sudo LD_LIBRARY_PATH="$HOME/cunit-install/$CROSS_ARCH/lib:$LD_LIBRARY_PATH" ODP_SHM_DIR=/dev/shm/odp make check ;
+    sudo ODP_CONFIG_FILE="`pwd`/config/odp-linux.conf" LD_LIBRARY_PATH="$HOME/cunit-install/$CROSS_ARCH/lib:$LD_LIBRARY_PATH" ODP_SHM_DIR=/dev/shm/odp make check ;

Comment:
OFP is using a similar naming for the environment variable but with 'CONF'
instead of 'CONFIG' ('OFP_CONF_FILE'). Will it makes sense to change the
naming to avoid confusions: either 'ODP_CONF_FILE' in ODP or
'OFP_CONFIG_FILE' in OFP?

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Do we really want to print this unconditionally? In any event shouldn't this
> be `ODP_LOG()` here rather than `printf()`?

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> According to the [libconfig
>> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
>> there are versions pre-1.0 (e.g., 0.9) which would fail this test. This
>> needs to be reversed so that you use the newer form for v1.5 and higher
>> levels:
>> ```
>> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
>> LIBCONFIG_VER_MINOR >= 5)) ...
>> ```

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Does this have to be a hard dependency? Can we have this feature be omitted
>>> (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170512105
updated_at 2018-02-26 07:40:52

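
Whichever name is chosen, the variable is read from the process environment at startup. A minimal sketch of that step, assuming a hypothetical helper `config_file_from_env()` that is not taken from the patch and that treats an unset or empty variable as "no file given":

```c
#include <stdlib.h>

/* Hypothetical helper, for illustration only: returns the config file
 * path stored in the named environment variable, or NULL when the
 * variable is unset or empty. */
const char *config_file_from_env(const char *var_name)
{
	const char *path = getenv(var_name);

	if (path == NULL || path[0] == '\0')
		return NULL;
	return path;
}
```

Taking the variable name as a parameter keeps the naming question raised in this message (`ODP_CONFIG_FILE` versus `ODP_CONF_FILE`) in a single place at the call site.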
Re: [lng-odp] [PATCH v1] linux-gen: add runtime configuration file
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page:

platform/linux-generic/pktio/dpdk.c
line 67
@@ -93,6 +95,56 @@ void refer_constructors(void)
 }
 #endif
+static void lookup_opt(config_setting_t *default_opt, config_setting_t *drv_opt,
+		       const char *opt, int *val)
+{
+	/* Default option */
+	config_setting_lookup_int(default_opt, opt, val);
+
+	/* Driver option overwrites default option */
+	if (drv_opt)
+		config_setting_lookup_int(drv_opt, opt, val);
+}
+
+static void init_options(pktio_entry_t *pktio_entry,
+			 const struct rte_eth_dev_info *dev_info)
+{
+	dpdk_opt_t *opt = &pktio_entry->s.pkt_dpdk.opt;
+	config_setting_t *default_opt;
+	config_setting_t *drv_opt;
+
+	/* Default values. Update 'config/odp-linux.conf' if modified. */
+	opt->num_rx_desc = DPDK_NM_RX_DESC;
+	opt->num_tx_desc = DPDK_NM_TX_DESC;
+	opt->rx_drop_en = 0;
+
+	default_opt = _odp_libconfig_lookup("pktio_dpdk");
+	if (!default_opt) {
+		ODP_DBG("No DPDK pktio options found\n");
+		goto done;
+	}
+
+	/* config_lookup_from() was renamed to config_setting_lookup() in
+	 * libconfig 1.5.0 */
+#if (LIBCONFIG_VER_MAJOR <= 1 && LIBCONFIG_VER_MINOR < 5)
+	drv_opt = config_lookup_from(default_opt, dev_info->driver_name);
+#else
+	drv_opt = config_setting_lookup(default_opt, dev_info->driver_name);
+#endif
+
+	/* Read options from config file */
+	lookup_opt(default_opt, drv_opt, "num_rx_desc", &opt->num_rx_desc);
+	lookup_opt(default_opt, drv_opt, "num_tx_desc", &opt->num_tx_desc);
+	lookup_opt(default_opt, drv_opt, "rx_drop_en", &opt->rx_drop_en);
+
+done:
+	printf("DPDK interface (%s): %" PRIu16 "\n", dev_info->driver_name,
+	       pktio_entry->s.pkt_dpdk.port_id);
+	printf(" num_rx_desc: %d\n", opt->num_rx_desc);
+	printf(" num_tx_desc: %d\n", opt->num_tx_desc);
+	printf(" rx_drop_en: %d\n", opt->rx_drop_en);

Comment:
Do we really want to print this unconditionally? In any event shouldn't this
be `ODP_LOG()` here rather than `printf()`?

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> According to the [libconfig
> changelog](https://github.com/hyperrealm/libconfig/blob/master/ChangeLog)
> there are versions pre-1.0 (e.g., 0.9) which would fail this test. This needs
> to be reversed so that you use the newer form for v1.5 and higher levels:
> ```
> #if (LIBCONFIG_VER_MAJOR > 1 || (LIBCONFIG_VER_MAJOR == 1 &&
> LIBCONFIG_VER_MINOR >= 5)) ...
> ```

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Does this have to be a hard dependency? Can we have this feature be omitted
>> (hardcoded defaults are used) if libconfig is not available?

https://github.com/Linaro/odp/pull/499#discussion_r170353045
updated_at 2018-02-23 20:00:18

Re: [lng-odp] [PATCH v1] changelog: updates for v1.18.0.0
Dmitry Eremin-Solenikov(lumag) replied on github web page:

CHANGELOG
line 209
@@ -1,3 +1,217 @@
+== OpenDataPlane (1.18.0.0)
+=== New Features
+ODP v1.18.0.0 is Tiger Moth Release Candidate 2 (RC 2). It completes the new
+APIs that are part of the Tiger Moth Long Term Support (LTS) release of ODP
+as well as various performance refinements and bug fixes. As of RC2 the ODP
+API is now frozen for the Tiger Moth development series.
+
+==== APIs
+The following new and changed APIs are included in this release:
+
+===== Addition of Shared Memory (SHM) Initialization Parameters
+The `odp_init_t` struct used as the argument to the `odp_init_global()` API
+has been expanded to include a `max_memory` field that specifies the maximum
+amount of shared memory (shm) that the application will use. This is to
+better enable ODP implementations to optimize their use of shared memory in
+support of the application. If left as (or defaulted) to 0, the implementation
+may choose a default limit for the application.
+
+===== Crypto Changes
+A number of crypto refinements are included in this release:
+
+* The single initialization vector (`iv`) in the `odp_crypto_session_param_t`
+is replaced by a separate `cipher_iv` and `auth_iv` fields.
+
+* The single initialization vector (`override_iv_ptr`) in the
+`odp_crypto_op_param_t` is replaced by a separate `cipher_iv_ptr` and
+`auth_iv_ptr` fields.
+
+* The special nature of GCM and GMAC cipher and authentication algorithms is
+clarified in that these ciphers always combine ciphering with authentication
+and hence require both to be specified when used. This is simply a
+documentation change as this requirement has always existed.
+
+* Enumerations for AES_CCM cipher (`ODP_CIPHER_ALG_AES_CCM`) and
+authentication (`ODP_AUTH_ALG_AES_CCM`) modes are added.
+
+* Enumeration for the AES_CMAC authentication algorithm
+(`ODP_AUTH_ALG_AES_CMAC`) is added.
+
+* Enumerations for the ChaCha20-Poly1305 cipher
+(`ODP_CIPHER_ALG_CHACHA20_POLY1305`) and authentication
+(`ODP_AUTH_ALG_CHACHA20_POLY1305`) modes are added.
+
+* Enumeration for the SHA-384 authentication algorithm
+(`ODP_AUTH_ALG_SHA384_HMAC`) is added.
+
+* Enumeration for the AES-XCBC-MAC authentication algorithm
+(`ODP_AUTH_ALG_AES_XCBC_MAC`) is added.
+
+===== Lock-free and block-free queues
+The `odp_nonblocking_t` enums introduced in ODP v1.17.0.0 are now returned
+as separate `odp_queue_capability()` limits for plain and scheduled queues. The
+ODP reference implementations now support `ODP_NONBLOCKING_LF` queues.
+
+===== User pointer initialized to NULL
+The specification for `odp_packet_user_ptr()` is clarified that unless
+overridden by `odp_packet_user_ptr_set()` the value of NULL will be returned.
+
+===== Removal of `ODP_PKTIN_WAIT` option
+The `ODP_PKTIN_WAIT` option on `odp_pktin_recv_tmo()` and
+`odp_pktin_recv_mq_tmo()` is removed. Timeout options now consist of
+`ODP_PKTIN_NO_WAIT` and a user-supplied timeout value. Since this timeout
+value can be specified to be arbitrarily long, there is no need for an
+indefinite wait capability as provision of such a capability proved
+problematic for some ODP implementations.
+
+===== Addition of packet protocol APIs
+The APIs `odp_packet_l2_type()`, `odp_packet_l3_type()`, and
+`odp_packet_l4_type()` are added to return the Layer 2, 3, and 4 protocols,
+respectively, associated with packets that have been parsed to the
+corresponding layer. If the packet was not parsed to the associated layer
+these return `ODP_PROTO_Ln_TYPE_NONE`.
+
+===== Asynchronous ordered locks
+Two new APIs, `odp_schedule_order_lock_start()` and
+`odp_schedule_order_lock_wait()` are added to allow for asynchronous
+ordered lock acquisition in addition to the existing synchronous
+`odp_schedule_order_lock()` API. In some implementations and applications,
+there may be a performance advantage to indicating the intent to acquire an
+ordered lock to allow the implementation to prepare for this while the
+application continues parallel processing and then enter the critical section
+protected by the ordered lock at a later time. In this case ordered lock
+protection is not guaranteed until the `odp_schedule_order_lock_wait()` call
+returns.
+
+===== IPsec API miscellaneous changes and enhancements
+IPsec support is further enhanced with the following:
+
+* The `odp_ipsec_ipv4_param_t` and `odp_ipsec_ipv6_param_t` structures
+are added to formalize the specification of IPv4 and IPv6 options in the
+`odp_ipsec_tunnel_param_t` configuration.
+
+* The `mode` field of the `odp_ipsec_out_t` is renamed to `frag_mode` for
+better clarity. In addition the `flag.frag-mode` option bit in the
+`odp_ipsec_out_opt_t` struct is defined to hold per-operation options for
+the `odp_ipsec_out_param_t` struct.
+
+* The `odp_ipsec_capability_t` struct returned by the `odp_ipsec_capability()`
+API is expanded to include the `odp_proto_chksums_t` available on inbound
+IPsec traffic. This indicates whether and how inbound
Re: [lng-odp] [PATCH v1] changelog: updates for v1.18.0.0
Dmitry Eremin-Solenikov(lumag) replied on github web page:

CHANGELOG
line 29
@@ -1,3 +1,217 @@
+== OpenDataPlane (1.18.0.0)
+=== New Features
+ODP v1.18.0.0 is Tiger Moth Release Candidate 2 (RC 2). It completes the new
+APIs that are part of the Tiger Moth Long Term Support (LTS) release of ODP
+as well as various performance refinements and bug fixes. As of RC2 the ODP
+API is now frozen for the Tiger Moth development series.
+
+ APIs
+The following new and changed APIs are included in this release:
+
+= Addition of Shared Memory (SHM) Initialization Parameters
+The `odp_init_t` struct used as the argument to the `odp_init_global()` API
+has been expanded to include a `max_memory` field that specifies the maximum
+amount of shared memory (shm) that the application will use. This is to
+better enable ODP implementations to optimize their use of shared memory in
+support of the application. If left as (or defaulted) to 0, the implementation
+may choose a default limit for the application.
+
+= Crypto Changes
+A number of crypto refinements are included in this release:
+
+* The single initialization vector (`iv`) in the `odp_crypto_session_param_t`
+is replaced by a separate `cipher_iv` and `auth_iv` fields.
+
+* The single initialization vector (`override_iv_ptr`) in the
+`odp_crypto_op_param_t` is replaced by a separate `cipher_iv_ptr` and
+`auth_iv_ptr` fields.
+
+* The special nature of GCM and GMAC cipher and authentication algorithms is

Comment:
Here and below. GCM, CCM, ChaCha20-Poly1305 are not "cipher and authentication" algorithms. They are "authenticated encryption" modes/algorithms.

> Dmitry Eremin-Solenikov(lumag) wrote:
> It should be noted, that there are/might be implementations that do not
> provide null PktIO

https://github.com/Linaro/odp/pull/500#discussion_r170590931
updated_at 2018-02-26 13:33:12
Re: [lng-odp] [PATCH v1] changelog: updates for v1.18.0.0
Dmitry Eremin-Solenikov(lumag) replied on github web page: CHANGELOG line 122 @@ -1,3 +1,217 @@ +== OpenDataPlane (1.18.0.0) +=== New Features +ODP v1.18.0.0 is Tiger Moth Release Candidate 2 (RC 2). It completes the new +APIs that are part of the Tiger Moth Long Term Support (LTS) release of ODP +as well as various performance refinements and bug fixes. As of RC2 the ODP +API is now frozen for the Tiger Moth development series. + + APIs +The following new and changed APIs are included in this release: + += Addition of Shared Memory (SHM) Initialization Parameters +The `odp_init_t` struct used as the argument to the `odp_init_global()` API +has been expanded to include a `max_memory` field that specifies the maximum +amount of shared memory (shm) that the application will use. This is to +better enable ODP implementations to optimize their use of shared memory in +support of the application. If left as (or defaulted) to 0, the implementation +may choose a default limit for the application. + += Crypto Changes +A number of crypto refinements are included in this release: + +* The single initialization vector (`iv`) in the `odp_crypto_session_param_t` +is replaced by a separate `cipher_iv` and `auth_iv` fields. + +* The single initialization vector (`override_iv_ptr`) in the +`odp_crypto_op_param_t` is replaced by a separate `cipher_iv_ptr` and +`auth_iv_ptr` fields. + +* The special nature of GCM and GMAC cipher and authentication algorithms is +clarified in that these ciphers always combine ciphering with authentication +and hence require both to be specified when used. This is simply a +documentation change as this requirement has always existed. + +* Enumerations for AES_CCM cipher (`ODP_CIPHER_ALG_AES_CCM`) and +authentication (`ODP_AUTH_ALG_AES_CCM`) modes are added. + +* Enumeration for the AES_CMAC authentication algorithm +(`ODP_AUTH_ALG_AES_CMAC`) is added. 
+ +* Enumerations for the ChaCha20-Poly1305 cipher +(`ODP_CIPHER_ALG_CHACHA20_POLY1305`) and authentication +(`ODP_AUTH_ALG_CHACHA20_POLY1305`) modes are added. + +* Enumeration for the SHA-384 authentication algorithm +(`ODP_AUTH_ALG_SHA384_HMAC`) is added. + +* Enumeration for the AES-XCBC-MAC authentication algorithm +(`ODP_AUTH_ALG_AES_XCBC_MAC`) is added. + += Lock-free and block-free queues +The `odp_nonblocking_t` enums introduced in ODP v1.17.0.0 are now returned +as separate `odp_queue_capability()` limits for plain and scheduled queues. The +ODP reference implementations now support `ODP_NONBLOCKING_LF` queues. + += User pointer initialized to NULL +The specification for `odp_packet_user_ptr()` is clarified that unless +overridden by `odp_packet_user_ptr_set()` the value of NULL will be returned. + += Removal of `ODP_PKTIN_WAIT` option +The `ODP_PKTIN_WAIT` option on `odp_pktin_recv_tmo()` and +`odp_pktin_recv_mq_tmo()` is removed. Timeout options now consist of +`ODP_PKTIN_NO_WAIT` and a user-supplied timeout value. Since this timeout +value can be specified to be arbitrarily long, there is no need for an +indefinite wait capability as provision of such a capability proved +problematic for some ODP implementations. + += Addition of packet protocol APIs +The APIs `odp_packet_l2_type()`, `odp_packet_l3_type()`, and +`odp_packet_l4_type()` are added to return the Layer 2, 3, and 4 protocols, +respectively, associated with packets that have been parsed to the +corresponding layer. If the packet was not parsed to the associated layer +these return `ODP_PROTO_Ln_TYPE_NONE`. + += Asynchronous ordered locks +Two new APIs, `odp_schedule_order_lock_start()` and +`odp_schedule_order_lock_wait()` are added to allow for asynchronous +ordered lock acquisition in addition to the existing synchronous +`odp_schedule_order_lock()` API. 
In some implementations and applications, +there may be a performance advantage to indicating the intent to acquire an +ordered lock to allow the implementation to prepare for this while the +application continues parallel processing and then enter the critical section +protected by the ordered lock at a later time. In this case ordered lock +protection is not guaranteed until the `odp_schedule_order_lock_wait()` call +returns. + += IPsec API miscellaneous changes and enhancements +IPsec support is further enhanced with the following: + +* The `odp_ipsec_ipv4_param_t` and `odp_ipsec_ipv6_param_t` structures +are added to formalize the specification of IPv4 and IPv6 options in the +`odp_ipsec_tunnel_param_t` configuration. + +* The `mode` field of the `odp_ipsec_out_t` is renamed to `frag_mode` for +better clarity. In addition the `flag.frag-mode` option bit in the +`odp_ipsec_out_opt_t` struct is defined to hold per-operation options for +the `odp_ipsec_out_param_t` struct. + +* The `odp_ipsec_capability_t` struct returned by the `odp_ipsec_capability()` +API is expanded to include the `odp_proto_chksums_t` available on inbound +IPsec traffic. This indicates whether and how inbound
[lng-odp] [PATCH API-NEXT v1 8/11] linux-gen: ipsec: separate ipv4/ipv6 flags
From: Dmitry Eremin-Solenikov

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 platform/linux-generic/odp_ipsec.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/platform/linux-generic/odp_ipsec.c b/platform/linux-generic/odp_ipsec.c
index cfdfa9dc9..55beb382e 100644
--- a/platform/linux-generic/odp_ipsec.c
+++ b/platform/linux-generic/odp_ipsec.c
@@ -269,6 +269,7 @@ typedef struct {
 	uint16_t ip_next_hdr_offset;
 	uint8_t ip_next_hdr;
 	unsigned is_ipv4 : 1;
+	unsigned is_ipv6 : 1;
 	union {
 		struct {
 			uint32_t ip_flabel;
@@ -669,10 +670,13 @@ static ipsec_sa_t *ipsec_in_single(odp_packet_t pkt,
 	 * state.is_ipv4 = odp_packet_has_ipv4(pkt);
 	 */
 	state.is_ipv4 = (((uint8_t *)state.ip)[0] >> 4) == 0x4;
+	state.is_ipv6 = (((uint8_t *)state.ip)[0] >> 4) == 0x6;
 	if (state.is_ipv4)
 		rc = ipsec_parse_ipv4(, pkt);
-	else
+	else if (state.is_ipv6)
 		rc = ipsec_parse_ipv6(, pkt);
+	else
+		rc = -1;
 	if (rc < 0 ||
 	    state.ip_tot_len + state.ip_offset > odp_packet_len(pkt)) {
 		status->error.alg = 1;
@@ -776,8 +780,10 @@ static ipsec_sa_t *ipsec_in_single(odp_packet_t pkt,
 		state.ip_tot_len -= state.ip_hdr_len + state.in.hdr_len;
 		if (_ODP_IPPROTO_IPIP == state.ip_next_hdr) {
 			state.is_ipv4 = 1;
+			state.is_ipv6 = 0;
 		} else if (_ODP_IPPROTO_IPV6 == state.ip_next_hdr) {
 			state.is_ipv4 = 0;
+			state.is_ipv6 = 1;
 		} else {
 			status->error.proto = 1;
 			goto err;
@@ -802,7 +808,7 @@ static ipsec_sa_t *ipsec_in_single(odp_packet_t pkt,
 		else
 			ipv4hdr->ttl -= ipsec_sa->dec_ttl;
 		_odp_ipv4_csum_update(pkt);
-	} else if (!state.is_ipv4 && odp_packet_len(pkt) > _ODP_IPV6HDR_LEN) {
+	} else if (state.is_ipv6 && odp_packet_len(pkt) > _ODP_IPV6HDR_LEN) {
 		_odp_ipv6hdr_t *ipv6hdr = odp_packet_l3_ptr(pkt, NULL);

 		if (ODP_IPSEC_MODE_TRANSPORT == ipsec_sa->mode)
@@ -816,9 +822,9 @@ static ipsec_sa_t *ipsec_in_single(odp_packet_t pkt,
 		goto err;
 	}

-	parse_param.proto = state.is_ipv4 ?
-		ODP_PROTO_IPV4 :
-		ODP_PROTO_IPV6;
+	parse_param.proto = state.is_ipv4 ? ODP_PROTO_IPV4 :
+			    state.is_ipv6 ? ODP_PROTO_IPV6 :
+			    ODP_PROTO_NONE;
 	parse_param.last_layer = ipsec_config.inbound.parse_level;
 	parse_param.chksums = ipsec_config.inbound.chksums;

@@ -934,6 +940,7 @@ static int ipsec_out_tunnel_ipv4(odp_packet_t *pkt,
 				 _ODP_IPV4HDR_PROTO_OFFSET;

 	state->is_ipv4 = 1;
+	state->is_ipv6 = 0;

 	return 0;
 }
@@ -990,6 +997,7 @@ static int ipsec_out_tunnel_ipv6(odp_packet_t *pkt,
 	state->ip_next_hdr_offset = state->ip_offset + _ODP_IPV6HDR_NHDR_OFFSET;

 	state->is_ipv4 = 0;
+	state->is_ipv6 = 1;

 	return 0;
 }
@@ -1320,20 +1328,25 @@ static ipsec_sa_t *ipsec_out_single(odp_packet_t pkt,
 	memset(, 0, sizeof(param));

 	state.is_ipv4 = (((uint8_t *)state.ip)[0] >> 4) == 0x4;
+	state.is_ipv6 = (((uint8_t *)state.ip)[0] >> 4) == 0x6;

 	if (ODP_IPSEC_MODE_TRANSPORT == ipsec_sa->mode) {
 		if (state.is_ipv4)
 			rc = ipsec_parse_ipv4(, pkt);
-		else
+		else if (state.is_ipv6)
 			rc = ipsec_parse_ipv6(, pkt);
+		else
+			rc = -1;

 		if (state.ip_tot_len + state.ip_offset != odp_packet_len(pkt))
 			rc = -1;
 	} else {
 		if (state.is_ipv4)
 			rc = ipsec_out_tunnel_parse_ipv4(, ipsec_sa);
-		else
+		else if (state.is_ipv6)
 			rc = ipsec_out_tunnel_parse_ipv6(, ipsec_sa);
+		else
+			rc = -1;
 		if (rc < 0) {
 			status->error.alg = 1;
 			goto err;
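The version test this patch adds can be looked at in isolation. What follows is a minimal sketch, not code from the patch: `ip_version_of()` is a hypothetical helper (it does not exist in ODP) that inspects the high nibble of the first header byte, which is the same check the diff applies to `state.ip`, and returns an error for anything that is neither IPv4 nor IPv6.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an ODP function: classify an IP header by the
 * version nibble in its first byte. Returns 4, 6, or -1 for anything else,
 * mirroring the is_ipv4 / is_ipv6 / "rc = -1" split shown in the diff. */
static int ip_version_of(const uint8_t *ip, size_t len)
{
	if (ip == NULL || len == 0)
		return -1;

	switch (ip[0] >> 4) {
	case 0x4:
		return 4;
	case 0x6:
		return 6;
	default:
		return -1;
	}
}
```

With a single `is_ipv4` bit, a header whose version nibble is neither 4 nor 6 would have been treated as IPv6 by the old `else` branch; the three-way result above is one way to picture why the second flag is useful.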
[lng-odp] [PATCH API-NEXT v1 9/11] linux-gen: ipsec: take output ip_param into account
From: Dmitry Eremin-SolenikovAllow per-packet override of IP parameters. Signed-off-by: Dmitry Eremin-Solenikov --- /** Email created from pull request 502 (lumag:ipsec-imp-upd) ** https://github.com/Linaro/odp/pull/502 ** Patch: https://github.com/Linaro/odp/pull/502.patch ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde **/ .../linux-generic/include/odp_ipsec_internal.h | 9 ++--- platform/linux-generic/odp_ipsec.c | 38 ++ platform/linux-generic/odp_ipsec_sad.c | 20 3 files changed, 40 insertions(+), 27 deletions(-) diff --git a/platform/linux-generic/include/odp_ipsec_internal.h b/platform/linux-generic/include/odp_ipsec_internal.h index a449462ab..dfde4d574 100644 --- a/platform/linux-generic/include/odp_ipsec_internal.h +++ b/platform/linux-generic/include/odp_ipsec_internal.h @@ -161,22 +161,17 @@ struct ipsec_sa_s { union { struct { + odp_ipsec_ipv4_param_t param; odp_u32be_t src_ip; odp_u32be_t dst_ip; /* 32-bit from which low 16 are used */ odp_atomic_u32_t hdr_id; - - uint8_t ttl; - uint8_t dscp; - uint8_t df; } tun_ipv4; struct { + odp_ipsec_ipv6_param_t param; uint8_t src_ip[_ODP_IPV6ADDR_LEN]; uint8_t dst_ip[_ODP_IPV6ADDR_LEN]; - uint8_t hlimit; - uint8_t dscp; - uint32_tflabel; } tun_ipv6; }; } out; diff --git a/platform/linux-generic/odp_ipsec.c b/platform/linux-generic/odp_ipsec.c index 55beb382e..09a4382cd 100644 --- a/platform/linux-generic/odp_ipsec.c +++ b/platform/linux-generic/odp_ipsec.c @@ -888,7 +888,8 @@ static int ipsec_out_tunnel_parse_ipv6(ipsec_state_t *state, static int ipsec_out_tunnel_ipv4(odp_packet_t *pkt, ipsec_state_t *state, -ipsec_sa_t *ipsec_sa) +ipsec_sa_t *ipsec_sa, +const odp_ipsec_ipv4_param_t *ipv4_param) { _odp_ipv4hdr_t out_ip; uint16_t flags; @@ -899,7 +900,7 @@ static int ipsec_out_tunnel_ipv4(odp_packet_t *pkt, else out_ip.tos = (state->out_tunnel.ip_tos & ~_ODP_IP_TOS_DSCP_MASK) | -(ipsec_sa->out.tun_ipv4.dscp << +(ipv4_param->dscp << 
_ODP_IP_TOS_DSCP_SHIFT); state->ip_tot_len = odp_packet_len(*pkt) - state->ip_offset; state->ip_tot_len += _ODP_IPV4HDR_LEN; @@ -911,13 +912,15 @@ static int ipsec_out_tunnel_ipv4(odp_packet_t *pkt, if (ipsec_sa->copy_df) flags = state->out_tunnel.ip_df; else - flags = ((uint16_t)ipsec_sa->out.tun_ipv4.df) << 14; + flags = ((uint16_t)ipv4_param->df) << 14; out_ip.frag_offset = _odp_cpu_to_be_16(flags); - out_ip.ttl = ipsec_sa->out.tun_ipv4.ttl; + out_ip.ttl = ipv4_param->ttl; /* Will be filled later by packet checksum update */ out_ip.chksum = 0; - out_ip.src_addr = ipsec_sa->out.tun_ipv4.src_ip; - out_ip.dst_addr = ipsec_sa->out.tun_ipv4.dst_ip; + memcpy(_ip.src_addr, ipv4_param->src_addr, + _ODP_IPV4ADDR_LEN); + memcpy(_ip.dst_addr, ipv4_param->dst_addr, + _ODP_IPV4ADDR_LEN); if (odp_packet_extend_head(pkt, _ODP_IPV4HDR_LEN, NULL, NULL) < 0) @@ -947,7 +950,8 @@ static int ipsec_out_tunnel_ipv4(odp_packet_t *pkt, static int ipsec_out_tunnel_ipv6(odp_packet_t *pkt, ipsec_state_t *state, -ipsec_sa_t *ipsec_sa) +ipsec_sa_t *ipsec_sa, +const odp_ipsec_ipv6_param_t *ipv6_param) { _odp_ipv6hdr_t out_ip; uint32_t ver; @@ -958,23 +962,23 @@ static int ipsec_out_tunnel_ipv6(odp_packet_t *pkt, else ver |= ((state->out_tunnel.ip_tos & ~_ODP_IP_TOS_DSCP_MASK) | - (ipsec_sa->out.tun_ipv6.dscp << + (ipv6_param->dscp << _ODP_IP_TOS_DSCP_SHIFT)) << _ODP_IPV6HDR_TC_SHIFT; if
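For readers following the DSCP and DF arithmetic in this patch, here is a small standalone sketch of how the outer IPv4 TOS byte and fragment flags are composed from per-packet parameters. The struct `tun_ipv4_param_t`, the two functions, and the three macros are illustrative names only; the numeric values are assumed from RFC 2474 and RFC 791 (DSCP in the upper six bits of TOS, DF as bit 14 of the flags/fragment-offset word) rather than read from the ODP headers.

```c
#include <stdint.h>

/* Assumed values, taken from RFC 2474 / RFC 791 rather than from the ODP
 * headers: DSCP occupies the upper six bits of the TOS byte, and DF is
 * bit 14 of the 16-bit flags/fragment-offset field. */
#define TOS_DSCP_MASK  0xFCu
#define TOS_DSCP_SHIFT 2
#define FRAG_DF_SHIFT  14

/* Hypothetical stand-in for the per-packet IPv4 tunnel parameters. */
typedef struct {
	uint8_t dscp; /* 6-bit DSCP value */
	uint8_t df;   /* 0 or 1 */
} tun_ipv4_param_t;

/* Keep the ECN bits of the inner TOS byte and replace only the DSCP part. */
static uint8_t tunnel_tos(uint8_t inner_tos, const tun_ipv4_param_t *p)
{
	return (uint8_t)((inner_tos & ~TOS_DSCP_MASK) |
			 ((unsigned)(p->dscp << TOS_DSCP_SHIFT) &
			  TOS_DSCP_MASK));
}

/* Host-order flags/fragment-offset word with only the DF bit populated. */
static uint16_t tunnel_frag_flags(const tun_ipv4_param_t *p)
{
	return (uint16_t)((p->df & 1u) << FRAG_DF_SHIFT);
}
```

This corresponds only to the branches of the diff where the SA does not copy the value from the inner packet (for DF, the `else` of `ipsec_sa->copy_df`); the result of `tunnel_frag_flags()` would still need converting to network byte order before being stored in the header.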
[lng-odp] [PATCH API-NEXT v1 11/11] validation: ipsec: inbound TFC dummy packets check
From: Dmitry Eremin-SolenikovSigned-off-by: Dmitry Eremin-Solenikov --- /** Email created from pull request 502 (lumag:ipsec-imp-upd) ** https://github.com/Linaro/odp/pull/502 ** Patch: https://github.com/Linaro/odp/pull/502.patch ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde **/ test/validation/api/ipsec/ipsec_test_in.c | 43 --- test/validation/api/ipsec/test_vectors.h | 19 +- 2 files changed, 57 insertions(+), 5 deletions(-) diff --git a/test/validation/api/ipsec/ipsec_test_in.c b/test/validation/api/ipsec/ipsec_test_in.c index d02cb2438..9c1112004 100644 --- a/test/validation/api/ipsec/ipsec_test_in.c +++ b/test/validation/api/ipsec/ipsec_test_in.c @@ -1131,13 +1131,17 @@ static void test_in_ipv4_mcgrew_gcm_4_esp(void) ipsec_sa_destroy(sa); } -#if 0 static void test_in_ipv4_mcgrew_gcm_12_esp(void) { odp_ipsec_tunnel_param_t tunnel = {}; odp_ipsec_sa_param_t param; odp_ipsec_sa_t sa; + /* This test will not work properly inbound inline mode. +* Packet might be dropped and we will not check for that. 
*/ + if (suite_context.inbound_op_mode == ODP_IPSEC_OP_MODE_INLINE) + return; + ipsec_sa_param_fill(, true, false, 0x335467ae, , ODP_CIPHER_ALG_AES_GCM, _mcgrew_gcm_12, @@ -1164,7 +1168,38 @@ static void test_in_ipv4_mcgrew_gcm_12_esp(void) ipsec_sa_destroy(sa); } -#endif + +static void test_in_ipv4_mcgrew_gcm_12_esp_notun(void) +{ + odp_ipsec_sa_param_t param; + odp_ipsec_sa_t sa; + + ipsec_sa_param_fill(, + true, false, 0x335467ae, NULL, + ODP_CIPHER_ALG_AES_GCM, _mcgrew_gcm_12, + ODP_AUTH_ALG_AES_GCM, NULL, + _mcgrew_gcm_salt_12); + + sa = odp_ipsec_sa_create(); + + CU_ASSERT_NOT_EQUAL_FATAL(ODP_IPSEC_SA_INVALID, sa); + + ipsec_test_part test = { + .pkt_in = _mcgrew_gcm_test_12_esp, + .out_pkt = 1, + .out = { + { .status.warn.all = 0, + .status.error.all = 0, + .l3_type = ODP_PROTO_L3_TYPE_IPV4, + .l4_type = ODP_PROTO_L4_TYPE_NO_NEXT, + .pkt_out = _mcgrew_gcm_test_12_notun }, + }, + }; + + ipsec_check_in_one(, sa); + + ipsec_sa_destroy(sa); +} static void test_in_ipv4_mcgrew_gcm_15_esp(void) { @@ -1584,10 +1619,10 @@ odp_testinfo_t ipsec_in_suite[] = { ipsec_check_esp_aes_gcm_256), ODP_TEST_INFO_CONDITIONAL(test_in_ipv4_mcgrew_gcm_4_esp, ipsec_check_esp_aes_gcm_128), -#if 0 ODP_TEST_INFO_CONDITIONAL(test_in_ipv4_mcgrew_gcm_12_esp, ipsec_check_esp_aes_gcm_128), -#endif + ODP_TEST_INFO_CONDITIONAL(test_in_ipv4_mcgrew_gcm_12_esp_notun, + ipsec_check_esp_aes_gcm_128), ODP_TEST_INFO_CONDITIONAL(test_in_ipv4_mcgrew_gcm_15_esp, ipsec_check_esp_null_aes_gmac_128), ODP_TEST_INFO_CONDITIONAL(test_in_ipv4_rfc7634_chacha, diff --git a/test/validation/api/ipsec/test_vectors.h b/test/validation/api/ipsec/test_vectors.h index f14fdb2b3..4d5ab3bdc 100644 --- a/test/validation/api/ipsec/test_vectors.h +++ b/test/validation/api/ipsec/test_vectors.h @@ -1641,7 +1641,7 @@ static const ipsec_test_packet pkt_mcgrew_gcm_test_4_esp = { static const ipsec_test_packet pkt_mcgrew_gcm_test_12 = { .len = 14, .l2_offset = 0, - .l3_offset = ODP_PACKET_OFFSET_INVALID, + .l3_offset = 14, 
.l4_offset = ODP_PACKET_OFFSET_INVALID, .data = { /* ETH - not a part of RFC, added for simplicity */ @@ -1650,6 +1650,23 @@ static const ipsec_test_packet pkt_mcgrew_gcm_test_12 = { }, }; +static const ipsec_test_packet pkt_mcgrew_gcm_test_12_notun = { + .len = 34, + .l2_offset = 0, + .l3_offset = 14, + .l4_offset = 34, + .data = { + /* ETH - not a part of RFC, added for simplicity */ + 0xf1, 0xf1, 0xf1, 0xf1, 0xf1, 0xf1, + 0xf2, 0xf2, 0xf2, 0xf2, 0xf2, 0xf2, 0x08, 0x00, + + /* IP - not a part of RFC, added for simplicity */ + 0x45, 0x00, 0x00, 0x14, 0x69, 0x8f, 0x00, 0x00, + 0x80, 0x3b, 0x4d, 0xcc, 0xc0, 0xa8, 0x01, 0x02, + 0xc0, 0xa8, 0x01, 0x01, + }, +}; + static const ipsec_test_packet pkt_mcgrew_gcm_test_12_esp = { .len = 70, .l2_offset = 0,
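The new `pkt_mcgrew_gcm_test_12_notun` vector can be sanity-checked by hand. The sketch below copies its 20 IPv4 header bytes and runs an RFC 1071 style one's-complement sum over them; `dummy_ipv4_hdr` and `ones_complement_sum()` are names invented for this illustration and are not part of the validation suite.

```c
#include <stddef.h>
#include <stdint.h>

/* IPv4 header bytes copied from pkt_mcgrew_gcm_test_12_notun in the patch
 * (the 14-byte Ethernet header is omitted). */
static const uint8_t dummy_ipv4_hdr[20] = {
	0x45, 0x00, 0x00, 0x14, 0x69, 0x8f, 0x00, 0x00,
	0x80, 0x3b, 0x4d, 0xcc, 0xc0, 0xa8, 0x01, 0x02,
	0xc0, 0xa8, 0x01, 0x01,
};

/* One's-complement sum over an even-length buffer, folded to 16 bits.
 * A header that carries a valid checksum sums to 0xffff. */
static uint16_t ones_complement_sum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

Summing the header including its checksum field (0x4dcc) gives 0xffff, so the vector is self-consistent. Byte 9 is 0x3b (59, No Next Header) and the total-length field is 0x0014, which matches `.len = 34` once the 14-byte Ethernet header is added back.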
[lng-odp] [PATCH API-NEXT v1 10/11] linux-gen: ipsec: support inbound TFC dummy packets
From: Dmitry Eremin-Solenikov

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 platform/linux-generic/odp_ipsec.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/platform/linux-generic/odp_ipsec.c b/platform/linux-generic/odp_ipsec.c
index 09a4382cd..37226cede 100644
--- a/platform/linux-generic/odp_ipsec.c
+++ b/platform/linux-generic/odp_ipsec.c
@@ -651,7 +651,6 @@ static ipsec_sa_t *ipsec_in_single(odp_packet_t pkt,
 	odp_crypto_packet_op_param_t param;
 	int rc;
 	odp_crypto_packet_result_t crypto; /**< Crypto operation result */
-	odp_packet_parse_param_t parse_param;
 	odp_packet_hdr_t *pkt_hdr;

 	state.ip_offset = odp_packet_l3_offset(pkt);
@@ -784,6 +783,9 @@ static ipsec_sa_t *ipsec_in_single(odp_packet_t pkt,
 		} else if (_ODP_IPPROTO_IPV6 == state.ip_next_hdr) {
 			state.is_ipv4 = 0;
 			state.is_ipv6 = 1;
+		} else if (_ODP_IPPROTO_NO_NEXT == state.ip_next_hdr) {
+			state.is_ipv4 = 0;
+			state.is_ipv6 = 0;
 		} else {
 			status->error.proto = 1;
 			goto err;
@@ -817,20 +819,30 @@ static ipsec_sa_t *ipsec_in_single(odp_packet_t pkt,
 					  _ODP_IPV6HDR_LEN);
 		else
 			ipv6hdr->hop_limit -= ipsec_sa->dec_ttl;
-	} else {
+	} else if (state.ip_next_hdr != _ODP_IPPROTO_NO_NEXT) {
 		status->error.proto = 1;
 		goto err;
 	}

-	parse_param.proto = state.is_ipv4 ? ODP_PROTO_IPV4 :
-			    state.is_ipv6 ? ODP_PROTO_IPV6 :
-			    ODP_PROTO_NONE;
-	parse_param.last_layer = ipsec_config.inbound.parse_level;
-	parse_param.chksums = ipsec_config.inbound.chksums;
+	if (_ODP_IPPROTO_NO_NEXT == state.ip_next_hdr &&
+	    ODP_IPSEC_MODE_TUNNEL == ipsec_sa->mode) {
+		odp_packet_hdr_t *pkt_hdr = packet_hdr(pkt);

-	/* We do not care about return code here.
-	 * Parsing error should not result in IPsec error. */
-	odp_packet_parse(pkt, state.ip_offset, _param);
+		packet_parse_reset(pkt_hdr);
+		pkt_hdr->p.l3_offset = state.ip_offset;
+	} else {
+		odp_packet_parse_param_t parse_param;
+
+		parse_param.proto = state.is_ipv4 ? ODP_PROTO_IPV4 :
+				    state.is_ipv6 ? ODP_PROTO_IPV6 :
+				    ODP_PROTO_NONE;
+		parse_param.last_layer = ipsec_config.inbound.parse_level;
+		parse_param.chksums = ipsec_config.inbound.chksums;
+
+		/* We do not care about return code here.
+		 * Parsing error should not result in IPsec error. */
+		odp_packet_parse(pkt, state.ip_offset, _param);
+	}

 	*pkt_out = pkt;

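The tunnel-mode decision this patch introduces can be summarised as a small classification step. The sketch below is an illustration under stated assumptions, not the patch's code: the `PROTO_*` constants use the IANA protocol numbers (4, 41, 59), while `inner_kind_t`, `classify_tunnel_payload()` and `should_parse()` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* IANA protocol numbers; the macro names are local to this sketch. */
#define PROTO_IPIP    4  /* IPv4-in-IP */
#define PROTO_IPV6    41 /* IPv6 encapsulation */
#define PROTO_NO_NEXT 59 /* No Next Header, used by TFC dummy packets */

typedef enum {
	INNER_IPV4,
	INNER_IPV6,
	INNER_DUMMY, /* TFC dummy packet: nothing inside to parse */
	INNER_ERROR  /* unsupported payload: report a protocol error */
} inner_kind_t;

/* Decide what a decapsulated tunnel-mode payload contains. */
static inner_kind_t classify_tunnel_payload(uint8_t next_hdr)
{
	switch (next_hdr) {
	case PROTO_IPIP:
		return INNER_IPV4;
	case PROTO_IPV6:
		return INNER_IPV6;
	case PROTO_NO_NEXT:
		return INNER_DUMMY;
	default:
		return INNER_ERROR;
	}
}

/* Only real inner IP packets are handed to the packet parser. */
static bool should_parse(inner_kind_t kind)
{
	return kind == INNER_IPV4 || kind == INNER_IPV6;
}
```

In the diff, the dummy case resets the parse state and records only the L3 offset instead of calling `odp_packet_parse()`; `should_parse()` returning false stands in for that branch.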
[lng-odp] [PATCH API-NEXT v1 2/11] validation: packet: verify odp_packet_l2_type()
From: Dmitry Eremin-Solenikov

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 test/validation/api/packet/packet.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/test/validation/api/packet/packet.c b/test/validation/api/packet/packet.c
index f829d0cb1..91e69204e 100644
--- a/test/validation/api/packet/packet.c
+++ b/test/validation/api/packet/packet.c
@@ -2552,6 +2552,8 @@ void packet_test_parse(void)
 		CU_ASSERT(odp_packet_has_udp(pkt[i]));
 		CU_ASSERT(!odp_packet_has_ipv6(pkt[i]));
 		CU_ASSERT(!odp_packet_has_tcp(pkt[i]));
+		CU_ASSERT_EQUAL(odp_packet_l2_type(pkt[i]),
+				ODP_PROTO_L2_TYPE_ETH);
 		CU_ASSERT_EQUAL(odp_packet_l3_type(pkt[i]),
 				ODP_PROTO_L3_TYPE_IPV4);
 		CU_ASSERT_EQUAL(odp_packet_l4_type(pkt[i]),
@@ -2620,6 +2622,8 @@ void packet_test_parse(void)
 		CU_ASSERT(odp_packet_has_tcp(pkt[i]));
 		CU_ASSERT(!odp_packet_has_ipv6(pkt[i]));
 		CU_ASSERT(!odp_packet_has_udp(pkt[i]));
+		CU_ASSERT_EQUAL(odp_packet_l2_type(pkt[i]),
+				ODP_PROTO_L2_TYPE_ETH);
 		CU_ASSERT_EQUAL(odp_packet_l3_type(pkt[i]),
 				ODP_PROTO_L3_TYPE_IPV4);
 		CU_ASSERT_EQUAL(odp_packet_l4_type(pkt[i]),
@@ -2655,6 +2659,8 @@ void packet_test_parse(void)
 		CU_ASSERT(odp_packet_has_udp(pkt[i]));
 		CU_ASSERT(!odp_packet_has_ipv4(pkt[i]));
 		CU_ASSERT(!odp_packet_has_tcp(pkt[i]));
+		CU_ASSERT_EQUAL(odp_packet_l2_type(pkt[i]),
+				ODP_PROTO_L2_TYPE_ETH);
 		CU_ASSERT_EQUAL(odp_packet_l3_type(pkt[i]),
 				ODP_PROTO_L3_TYPE_IPV6);
 		CU_ASSERT_EQUAL(odp_packet_l4_type(pkt[i]),
@@ -2690,6 +2696,8 @@ void packet_test_parse(void)
 		CU_ASSERT(odp_packet_has_tcp(pkt[i]));
 		CU_ASSERT(!odp_packet_has_ipv4(pkt[i]));
 		CU_ASSERT(!odp_packet_has_udp(pkt[i]));
+		CU_ASSERT_EQUAL(odp_packet_l2_type(pkt[i]),
+				ODP_PROTO_L2_TYPE_ETH);
 		CU_ASSERT_EQUAL(odp_packet_l3_type(pkt[i]),
 				ODP_PROTO_L3_TYPE_IPV6);
 		CU_ASSERT_EQUAL(odp_packet_l4_type(pkt[i]),
[lng-odp] [PATCH API-NEXT v1 7/11] validation: ipsec: add L3/L4 types validation
From: Dmitry Eremin-SolenikovSigned-off-by: Dmitry Eremin-Solenikov --- /** Email created from pull request 502 (lumag:ipsec-imp-upd) ** https://github.com/Linaro/odp/pull/502 ** Patch: https://github.com/Linaro/odp/pull/502.patch ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde **/ test/validation/api/ipsec/ipsec.c | 8 +++ test/validation/api/ipsec/ipsec.h | 5 ++ test/validation/api/ipsec/ipsec_test_in.c | 85 ++ test/validation/api/ipsec/ipsec_test_out.c | 10 4 files changed, 108 insertions(+) diff --git a/test/validation/api/ipsec/ipsec.c b/test/validation/api/ipsec/ipsec.c index 31bd557fe..6043debf9 100644 --- a/test/validation/api/ipsec/ipsec.c +++ b/test/validation/api/ipsec/ipsec.c @@ -689,6 +689,14 @@ void ipsec_check_in_one(const ipsec_test_part *part, odp_ipsec_sa_t sa) } ipsec_check_packet(part->out[i].pkt_out, pkto[i]); + if (part->out[i].pkt_out != NULL && + part->out[i].l3_type != _ODP_PROTO_L3_TYPE_UNDEF) + CU_ASSERT_EQUAL(part->out[i].l3_type, + odp_packet_l3_type(pkto[i])); + if (part->out[i].pkt_out != NULL && + part->out[i].l4_type != _ODP_PROTO_L4_TYPE_UNDEF) + CU_ASSERT_EQUAL(part->out[i].l4_type, + odp_packet_l4_type(pkto[i])); odp_packet_free(pkto[i]); } } diff --git a/test/validation/api/ipsec/ipsec.h b/test/validation/api/ipsec/ipsec.h index 7ba9ef10e..b2d6df698 100644 --- a/test/validation/api/ipsec/ipsec.h +++ b/test/validation/api/ipsec/ipsec.h @@ -42,6 +42,9 @@ typedef struct { uint8_t data[256]; } ipsec_test_packet; +#define _ODP_PROTO_L3_TYPE_UNDEF ((odp_proto_l3_type_t)-1) +#define _ODP_PROTO_L4_TYPE_UNDEF ((odp_proto_l4_type_t)-1) + typedef struct { const ipsec_test_packet *pkt_in; odp_bool_t lookup; @@ -51,6 +54,8 @@ typedef struct { struct { odp_ipsec_op_status_t status; const ipsec_test_packet *pkt_out; + odp_proto_l3_type_t l3_type; + odp_proto_l4_type_t l4_type; } out[1]; } ipsec_test_part; diff --git a/test/validation/api/ipsec/ipsec_test_in.c 
b/test/validation/api/ipsec/ipsec_test_in.c index 8138defb5..d02cb2438 100644 --- a/test/validation/api/ipsec/ipsec_test_in.c +++ b/test/validation/api/ipsec/ipsec_test_in.c @@ -31,6 +31,8 @@ static void test_in_ipv4_ah_sha256(void) .out = { { .status.warn.all = 0, .status.error.all = 0, + .l3_type = ODP_PROTO_L3_TYPE_IPV4, + .l4_type = ODP_PROTO_L4_TYPE_ICMPV4, .pkt_out = _ipv4_icmp_0 }, }, }; @@ -62,6 +64,8 @@ static void test_in_ipv4_ah_sha256_tun_ipv4(void) .out = { { .status.warn.all = 0, .status.error.all = 0, + .l3_type = ODP_PROTO_L3_TYPE_IPV4, + .l4_type = ODP_PROTO_L4_TYPE_ICMPV4, .pkt_out = _ipv4_icmp_0 }, }, }; @@ -93,6 +97,8 @@ static void test_in_ipv4_ah_sha256_tun_ipv6(void) .out = { { .status.warn.all = 0, .status.error.all = 0, + .l3_type = ODP_PROTO_L3_TYPE_IPV4, + .l4_type = ODP_PROTO_L4_TYPE_ICMPV4, .pkt_out = _ipv4_icmp_0 }, }, }; @@ -123,6 +129,9 @@ static void test_in_ipv4_ah_sha256_tun_ipv4_notun(void) .out = { { .status.warn.all = 0, .status.error.all = 0, + .l3_type = ODP_PROTO_L3_TYPE_IPV4, + /* It is L4_TYPE_IPV4 */ + .l4_type = _ODP_PROTO_L4_TYPE_UNDEF, .pkt_out = _ipv4_icmp_0_ipip }, }, }; @@ -153,6 +162,8 @@ static void test_in_ipv4_esp_null_sha256(void) .out = { { .status.warn.all = 0, .status.error.all = 0, + .l3_type = ODP_PROTO_L3_TYPE_IPV4, + .l4_type = ODP_PROTO_L4_TYPE_ICMPV4, .pkt_out = _ipv4_icmp_0 }, }, }; @@ -183,6 +194,8 @@ static void test_in_ipv4_esp_aes_cbc_null(void) .out = { { .status.warn.all = 0, .status.error.all = 0, + .l3_type = ODP_PROTO_L3_TYPE_IPV4, +
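The `_ODP_PROTO_L3_TYPE_UNDEF` / `_ODP_PROTO_L4_TYPE_UNDEF` sentinels follow a common test-harness pattern: an out-of-range enum value that means "no expectation, skip this check". A minimal sketch of the pattern, using a made-up `l3_type_t` enum and helper rather than the real ODP types:

```c
#include <stdbool.h>

/* Hypothetical enum standing in for odp_proto_l3_type_t. */
typedef enum { L3_NONE = 0, L3_IPV4 = 1, L3_IPV6 = 2 } l3_type_t;

/* Sentinel outside the valid range, built the same way as in the patch. */
#define L3_TYPE_UNDEF ((l3_type_t)-1)

/* True when the test has no expectation, or when the expectation is met. */
static bool l3_expectation_holds(l3_type_t expected, l3_type_t actual)
{
	if (expected == L3_TYPE_UNDEF)
		return true;

	return expected == actual;
}
```

One property of this pattern in C generally: a field omitted from a designated initializer is zero-initialized, so it takes the value 0 rather than the sentinel. A test that wants the check skipped therefore has to set the sentinel explicitly, as `test_in_ipv4_ah_sha256_tun_ipv4_notun` does for `.l4_type` in the diff.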
[lng-odp] [PATCH API-NEXT v1 6/11] linux-gen: packet: support L4 type No Next Header
From: Dmitry Eremin-Solenikov

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 platform/linux-generic/include/odp/api/plat/packet_inline_types.h | 1 +
 platform/linux-generic/include/protocols/ip.h                     | 1 +
 platform/linux-generic/odp_packet.c                               | 6 ++++++
 3 files changed, 8 insertions(+)

diff --git a/platform/linux-generic/include/odp/api/plat/packet_inline_types.h b/platform/linux-generic/include/odp/api/plat/packet_inline_types.h
index 2e8efecb6..4b88d3517 100644
--- a/platform/linux-generic/include/odp/api/plat/packet_inline_types.h
+++ b/platform/linux-generic/include/odp/api/plat/packet_inline_types.h
@@ -90,6 +90,7 @@ typedef union {
 		uint64_t tcp:1; /* TCP */
 		uint64_t sctp:1; /* SCTP */
 		uint64_t icmp:1; /* ICMP */
+		uint64_t no_next_hdr:1; /* No Next Header */

 		uint64_t color:2; /* Packet color for traffic mgmt */
 		uint64_t nodrop:1; /* Drop eligibility status */
diff --git a/platform/linux-generic/include/protocols/ip.h b/platform/linux-generic/include/protocols/ip.h
index f02980693..19aef3dcc 100644
--- a/platform/linux-generic/include/protocols/ip.h
+++ b/platform/linux-generic/include/protocols/ip.h
@@ -167,6 +167,7 @@ typedef struct ODP_PACKED {
 #define _ODP_IPPROTO_AH 0x33 /**< Authentication Header (51) */
 #define _ODP_IPPROTO_ESP 0x32 /**< Encapsulating Security Payload (50) */
 #define _ODP_IPPROTO_ICMPV6 0x3A /**< Internet Control Message Protocol (58) */
+#define _ODP_IPPROTO_NO_NEXT 0x3B /**< No Next Header (59) */
 #define _ODP_IPPROTO_DEST 0x3C /**< IPv6 Destination header (60) */
 #define _ODP_IPPROTO_SCTP 0x84 /**< Stream Control Transmission protocol (132) */

diff --git a/platform/linux-generic/odp_packet.c b/platform/linux-generic/odp_packet.c
index 6fc5f2206..7cbf1b9ef 100644
--- a/platform/linux-generic/odp_packet.c
+++ b/platform/linux-generic/odp_packet.c
@@ -2229,6 +2229,10 @@ int packet_parse_common_l3_l4(packet_parser_t *prs, const uint8_t *parseptr,
 		prs->input_flags.sctp = 1;
 		break;

+	case _ODP_IPPROTO_NO_NEXT:
+		prs->input_flags.no_next_hdr = 1;
+		break;
+
 	default:
 		prs->input_flags.l4 = 0;
 		break;
@@ -2550,6 +2554,8 @@ odp_proto_l4_type_t odp_packet_l4_type(odp_packet_t pkt)
 	else if (pkt_hdr->p.input_flags.icmp &&
 		 pkt_hdr->p.input_flags.ipv6)
 		return ODP_PROTO_L4_TYPE_ICMPV6;
+	else if (pkt_hdr->p.input_flags.no_next_hdr)
+		return ODP_PROTO_L4_TYPE_NO_NEXT;

 	return ODP_PROTO_L4_TYPE_NONE;
 }
[lng-odp] [PATCH API-NEXT v1 4/11] linux-gen: ipsec: provide global init/term functions
From: Dmitry Eremin-Solenikov

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 platform/linux-generic/include/odp_internal.h |  4 ++++
 platform/linux-generic/odp_init.c             | 13 +++++++++++++
 platform/linux-generic/odp_ipsec.c            | 13 +++++++++++++
 3 files changed, 30 insertions(+)

diff --git a/platform/linux-generic/include/odp_internal.h b/platform/linux-generic/include/odp_internal.h
index 444e1163b..fcf7d1109 100644
--- a/platform/linux-generic/include/odp_internal.h
+++ b/platform/linux-generic/include/odp_internal.h
@@ -76,6 +76,7 @@ enum init_stage {
 	NAME_TABLE_INIT,
 	IPSEC_EVENTS_INIT,
 	IPSEC_SAD_INIT,
+	IPSEC_INIT,
 	ALL_INIT      /* All init stages completed */
 };
 
@@ -136,6 +137,9 @@ int _odp_ishm_init_local(void);
 int _odp_ishm_term_global(void);
 int _odp_ishm_term_local(void);
 
+int _odp_ipsec_init_global(void);
+int _odp_ipsec_term_global(void);
+
 int _odp_ipsec_sad_init_global(void);
 int _odp_ipsec_sad_term_global(void);
 
diff --git a/platform/linux-generic/odp_init.c b/platform/linux-generic/odp_init.c
index a2d9d52ff..0da1a5d11 100644
--- a/platform/linux-generic/odp_init.c
+++ b/platform/linux-generic/odp_init.c
@@ -150,6 +150,12 @@ int odp_init_global(odp_instance_t *instance,
 	}
 	stage = IPSEC_SAD_INIT;
 
+	if (_odp_ipsec_init_global()) {
+		ODP_ERR("ODP IPsec init failed.\n");
+		goto init_failed;
+	}
+	stage = IPSEC_INIT;
+
 	*instance = (odp_instance_t)odp_global_data.main_pid;
 
 	return 0;
@@ -174,6 +180,13 @@ int _odp_term_global(enum init_stage stage)
 
 	switch (stage) {
 	case ALL_INIT:
+	case IPSEC_INIT:
+		if (_odp_ipsec_term_global()) {
+			ODP_ERR("ODP IPsec term failed.\n");
+			rc = -1;
+		}
+		/* Fall through */
+
 	case IPSEC_SAD_INIT:
 		if (_odp_ipsec_sad_term_global()) {
 			ODP_ERR("ODP IPsec SAD term failed.\n");
diff --git a/platform/linux-generic/odp_ipsec.c b/platform/linux-generic/odp_ipsec.c
index 3e6a80987..8c3d6cd63 100644
--- a/platform/linux-generic/odp_ipsec.c
+++ b/platform/linux-generic/odp_ipsec.c
@@ -1796,3 +1796,16 @@ odp_event_t odp_ipsec_packet_to_event(odp_packet_t pkt)
 {
 	return odp_packet_to_event(pkt);
 }
+
+int _odp_ipsec_init_global(void)
+{
+	odp_ipsec_config_init(_config);
+
+	return 0;
+}
+
+int _odp_ipsec_term_global(void)
+{
+	/* Do nothing for now */
+	return 0;
+}
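The staged init/term pattern this patch extends can be sketched in isolation as below. This is a minimal illustration, not the ODP code: the stage names, `term_log` and the two `*_term()` helpers are hypothetical stand-ins. It only shows how recording the last completed stage lets the switch fall through and tear down everything initialised so far, newest first.

```c
#include <assert.h>

/* Hypothetical stages, mirroring the shape of enum init_stage in the patch. */
enum stage { NO_INIT, SAD_INIT, IPSEC_INIT, ALL_INIT };

/* Records the teardown order so the behaviour can be inspected. */
static char term_log[8];
static int term_len;

static int sad_term(void)
{
	assert(term_len < (int)sizeof(term_log));
	term_log[term_len++] = 'S';
	return 0;
}

static int ipsec_term(void)
{
	assert(term_len < (int)sizeof(term_log));
	term_log[term_len++] = 'I';
	return 0;
}

/* Tear down every stage at or below 'stage', in reverse init order. */
static int term_global(enum stage stage)
{
	int rc = 0;

	term_len = 0;

	switch (stage) {
	case ALL_INIT:
	case IPSEC_INIT:
		if (ipsec_term())
			rc = -1;
		/* Fall through */
	case SAD_INIT:
		if (sad_term())
			rc = -1;
		/* Fall through */
	case NO_INIT:
		break;
	}

	return rc;
}
```

A failed init would call `term_global()` with the last stage that completed, so a failure right after the SAD stage runs only `sad_term()`.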
[lng-odp] [PATCH API-NEXT v1 3/11] validation: ipsec: set frag_mode flag
From: Dmitry Eremin-Solenikov

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 test/validation/api/ipsec/ipsec_test_out.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/test/validation/api/ipsec/ipsec_test_out.c b/test/validation/api/ipsec/ipsec_test_out.c
index 2850ddfa4..911987388 100644
--- a/test/validation/api/ipsec/ipsec_test_out.c
+++ b/test/validation/api/ipsec/ipsec_test_out.c
@@ -572,7 +572,8 @@ static void test_out_ipv4_ah_sha256_frag_check(void)
 	ipsec_test_part test2 = {
 		.pkt_in = _ipv4_icmp_0,
 		.num_opt = 1,
-		.opt = { .frag_mode = ODP_IPSEC_FRAG_DISABLED, },
+		.opt = { .flag.frag_mode = 1,
+			 .frag_mode = ODP_IPSEC_FRAG_DISABLED, },
 		.out_pkt = 1,
 		.out = {
 			{ .status.warn.all = 0,
@@ -665,7 +666,8 @@ static void test_out_ipv4_esp_null_sha256_frag_check(void)
 	ipsec_test_part test2 = {
 		.pkt_in = _ipv4_icmp_0,
 		.num_opt = 1,
-		.opt = { .frag_mode = ODP_IPSEC_FRAG_DISABLED, },
+		.opt = { .flag.frag_mode = 1,
+			 .frag_mode = ODP_IPSEC_FRAG_DISABLED, },
 		.out_pkt = 1,
 		.out = {
 			{ .status.warn.all = 0,
[lng-odp] [PATCH API-NEXT v1 5/11] linux-gen: ipsec: take ipsec_out_opt flags into account
From: Dmitry Eremin-Solenikov

Only override frag_mode if respective flag is set.

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 platform/linux-generic/odp_ipsec.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/platform/linux-generic/odp_ipsec.c b/platform/linux-generic/odp_ipsec.c
index 8c3d6cd63..cfdfa9dc9 100644
--- a/platform/linux-generic/odp_ipsec.c
+++ b/platform/linux-generic/odp_ipsec.c
@@ -1296,6 +1296,7 @@ static ipsec_sa_t *ipsec_out_single(odp_packet_t pkt,
 	int rc;
 	odp_crypto_packet_result_t crypto; /**< Crypto operation result */
 	odp_packet_hdr_t *pkt_hdr;
+	odp_ipsec_frag_mode_t frag_mode;
 	uint32_t mtu;
 
 	state.ip_offset = odp_packet_l3_offset(pkt);
@@ -1307,8 +1308,10 @@ static ipsec_sa_t *ipsec_out_single(odp_packet_t pkt,
 	ipsec_sa = _odp_ipsec_sa_use(sa);
 	ODP_ASSERT(NULL != ipsec_sa);
 
-	if ((opt && opt->frag_mode == ODP_IPSEC_FRAG_CHECK) ||
-	    (!opt && ipsec_sa->out.frag_mode == ODP_IPSEC_FRAG_CHECK))
+	frag_mode = ipsec_sa->out.frag_mode;
+	if (opt->flag.frag_mode)
+		frag_mode = opt->frag_mode;
+	if (frag_mode == ODP_IPSEC_FRAG_CHECK)
 		mtu = ipsec_sa->out.mtu;
 	else
 		mtu = UINT32_MAX;
@@ -1467,6 +1470,8 @@ int odp_ipsec_in(const odp_packet_t pkt_in[], int num_in,
 	return in_pkt;
 }
 
+static odp_ipsec_out_opt_t default_out_opt;
+
 int odp_ipsec_out(const odp_packet_t pkt_in[], int num_in,
 		  odp_packet_t pkt_out[], int *num_out,
 		  const odp_ipsec_out_param_t *param)
@@ -1495,7 +1500,7 @@ int odp_ipsec_out(const odp_packet_t pkt_in[], int num_in,
 		ODP_ASSERT(ODP_IPSEC_SA_INVALID != sa);
 
 		if (0 == param->num_opt)
-			opt = NULL;
+			opt = &default_out_opt;
 		else
 			opt = &param->opt[opt_idx];
 
@@ -1602,7 +1607,7 @@ int odp_ipsec_out_enq(const odp_packet_t pkt_in[], int num_in,
 		ODP_ASSERT(ODP_IPSEC_SA_INVALID != sa);
 
 		if (0 == param->num_opt)
-			opt = NULL;
+			opt = &default_out_opt;
 		else
 			opt = &param->opt[opt_idx];
 
@@ -1697,7 +1702,7 @@ int odp_ipsec_out_inline(const odp_packet_t pkt_in[], int num_in,
 		}
 
 		if (0 == param->num_opt)
-			opt = NULL;
+			opt = &default_out_opt;
 		else
 			opt = &param->opt[opt_idx];
 
@@ -1801,6 +1806,8 @@ int _odp_ipsec_init_global(void)
 {
 	odp_ipsec_config_init(_config);
 
+	memset(&default_out_opt, 0, sizeof(default_out_opt));
+
 	return 0;
 }
 
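The rule "only override frag_mode if the respective flag is set" can be tried out with the standalone sketch below. The types `frag_mode_t`, `out_opt_t`, `sa_out_t` and the helper `effective_mtu()` are hypothetical simplifications, not the ODP definitions; the all-zero default option plays the role of `default_out_opt` in the patch, so callers that pass no options always get the SA setting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ODP types touched by the patch. */
typedef enum { FRAG_DISABLED, FRAG_CHECK } frag_mode_t;

typedef struct {
	struct {
		unsigned int frag_mode : 1; /* 1: frag_mode below is valid */
	} flag;
	frag_mode_t frag_mode;
} out_opt_t;

typedef struct {
	frag_mode_t frag_mode; /* per-SA default */
	uint32_t mtu;
} sa_out_t;

/* All flags clear, so the SA defaults always win. */
static const out_opt_t default_out_opt = { { 0 }, FRAG_DISABLED };

/* MTU to enforce for one packet; UINT32_MAX means "no check". */
static uint32_t effective_mtu(const sa_out_t *sa, const out_opt_t *opt)
{
	frag_mode_t frag_mode;

	assert(sa != NULL);

	if (opt == NULL)
		opt = &default_out_opt;

	frag_mode = sa->frag_mode;
	if (opt->flag.frag_mode)
		frag_mode = opt->frag_mode;

	return frag_mode == FRAG_CHECK ? sa->mtu : UINT32_MAX;
}
```

An option struct whose `frag_mode` field is set but whose flag is left at 0 is ignored, which is why the validation tests in patch 3/11 had to start setting `.flag.frag_mode = 1`.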
[lng-odp] [PATCH API-NEXT v1 0/11] IPsec implementation update
github
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
/github

checkpatch.pl
total: 0 errors, 0 warnings, 0 checks, 16 lines checked

to_send-p-000.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 32 lines checked

to_send-p-001.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 18 lines checked

to_send-p-002.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 57 lines checked

to_send-p-003.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 59 lines checked

to_send-p-004.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 32 lines checked

to_send-p-005.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 408 lines checked

to_send-p-006.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 92 lines checked

to_send-p-007.patch has no obvious style problems and is ready for submission.
WARNING: line over 80 characters
#141: FILE: platform/linux-generic/odp_ipsec.c:1363:
+ _sa->out.tun_ipv4.param);

WARNING: line over 80 characters
#147: FILE: platform/linux-generic/odp_ipsec.c:1368:
+ _sa->out.tun_ipv6.param);

total: 0 errors, 2 warnings, 0 checks, 150 lines checked

to_send-p-008.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
total: 0 errors, 0 warnings, 0 checks, 55 lines checked

to_send-p-009.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 100 lines checked

to_send-p-010.patch has no obvious style problems and is ready for submission.
/checkpatch.pl
[lng-odp] [PATCH API-NEXT v1 1/11] linux-gen: packet: add odp_packet_l2_type() implementation
From: Dmitry Eremin-Solenikov

Signed-off-by: Dmitry Eremin-Solenikov
---
/** Email created from pull request 502 (lumag:ipsec-imp-upd)
 ** https://github.com/Linaro/odp/pull/502
 ** Patch: https://github.com/Linaro/odp/pull/502.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 8c909084626ccef140542645cd34549ce7f4bcde
 **/
 platform/linux-generic/odp_packet.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/platform/linux-generic/odp_packet.c b/platform/linux-generic/odp_packet.c
index 46b11cba1..6fc5f2206 100644
--- a/platform/linux-generic/odp_packet.c
+++ b/platform/linux-generic/odp_packet.c
@@ -2506,6 +2506,16 @@ int odp_packet_has_ref(odp_packet_t pkt)
 	return 0;
 }
 
+odp_proto_l2_type_t odp_packet_l2_type(odp_packet_t pkt)
+{
+	odp_packet_hdr_t *pkt_hdr = packet_hdr(pkt);
+
+	if (pkt_hdr->p.input_flags.eth)
+		return ODP_PROTO_L2_TYPE_ETH;
+
+	return ODP_PROTO_L2_TYPE_NONE;
+}
+
 odp_proto_l3_type_t odp_packet_l3_type(odp_packet_t pkt)
 {
 	odp_packet_hdr_t *pkt_hdr = packet_hdr(pkt);
[lng-odp] [PATCH API-NEXT v1 1/1] update Linaro Copyrights to 2018 year part2
From: Maxim Uvarov

update Copyrights with the same script in rebased branch.

Signed-off-by: Maxim Uvarov
---
/** Email created from pull request 501 (muvarov:devel/api_next_copyrights2)
 ** https://github.com/Linaro/odp/pull/501
 ** Patch: https://github.com/Linaro/odp/pull/501.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 2fa6a554aea6ad85113626b64a5be240458c653d
 **/
 include/odp/api/abi-default/atomic.h | 2 +-
 include/odp/api/abi-default/byteorder.h | 2 +-
 include/odp/api/abi-default/debug.h | 2 +-
 include/odp/api/abi-default/ipsec.h | 2 +-
 include/odp/api/abi-default/packet_flags.h | 2 +-
 include/odp/api/abi-default/packet_io.h | 2 +-
 include/odp/api/abi-default/schedule.h | 2 +-
 include/odp/api/abi-default/sync.h | 2 +-
 include/odp/api/abi-default/timer.h | 2 +-
 include/odp/api/align.h | 2 +-
 include/odp/api/atomic.h | 2 +-
 include/odp/api/classification.h | 2 +-
 include/odp/api/debug.h | 2 +-
 include/odp/api/packet.h | 2 +-
 include/odp/api/queue.h | 2 +-
 include/odp/api/spec/thread_types.h | 2 +-
 include/odp/api/std_types.h | 2 +-
 include/odp/api/sync.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/byteorder.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/cpumask.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/debug.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/init.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/ipsec.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/packet_flags.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/packet_io.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/rwlock.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/rwlock_recursive.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/schedule.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/schedule_types.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/spinlock.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/spinlock_recursive.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/std_clib.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/std_types.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/sync.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/thread.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/thrmask.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/ticketlock.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/time.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/timer.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/traffic_mngr.h | 2 +-
 include/odp/arch/arm32-linux/odp/api/abi/version.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/align.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/atomic.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/barrier.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/byteorder.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/cpumask.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/debug.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/init.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/ipsec.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/packet_flags.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/packet_io.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/rwlock.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/rwlock_recursive.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/schedule.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/schedule_types.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/spinlock.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/spinlock_recursive.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/std_clib.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/std_types.h | 2 +-
 include/odp/arch/arm64-linux/odp/api/abi/sync.h
[lng-odp] [PATCH API-NEXT v1 0/1] update Linaro Copyrights to 2018 year part2
update Copyrights with the same script in rebased branch.

Signed-off-by: Maxim Uvarov maxim.uva...@linaro.org

github
/** Email created from pull request 501 (muvarov:devel/api_next_copyrights2)
 ** https://github.com/Linaro/odp/pull/501
 ** Patch: https://github.com/Linaro/odp/pull/501.patch
 ** Base sha: c91eae61d19350dd19aacf18c1148c9491398c14
 ** Merge commit sha: 2fa6a554aea6ad85113626b64a5be240458c653d
 **/
/github

checkpatch.pl
total: 0 errors, 0 warnings, 0 checks, 1247 lines checked

to_send-p-000.patch has no obvious style problems and is ready for submission.
/checkpatch.pl
[lng-odp] [PATCH v1 0/1] changelog: updates for v1.18.0.0
Add updates for v1.18.0.0 (Tiger Moth RC2)

Signed-off-by: Bill Fischofer bill.fischo...@linaro.org

github
/** Email created from pull request 500 (Bill-Fischofer-Linaro:v1.18-changelog)
 ** https://github.com/Linaro/odp/pull/500
 ** Patch: https://github.com/Linaro/odp/pull/500.patch
 ** Base sha: ba28192c7622cb924897c0fe0649a33b92fc4a01
 ** Merge commit sha: 02d00931217262cd5af9c5560bd571033f99ab45
 **/
/github

checkpatch.pl
total: 0 errors, 0 warnings, 0 checks, 217 lines checked

to_send-p-000.patch has no obvious style problems and is ready for submission.
/checkpatch.pl
[lng-odp] [PATCH v1 0/1] linux-gen: add runtime configuration file
Enables changing ODP runtime configuration options by using an optional
configuration file (libconfig). Path to the conf file is passed using
environment variable ODP_CONF_FILE. If ODP_CONF_FILE or a particular option
is not set, hardcoded default values are used intead. An example
configuration file is provided in config/odp-linux.conf.

Runtime configuration is initially used by DPDK pktio to set NIC options.

Adds new dependency to libconfig library.

Signed-off-by: Matias Elo matias@nokia.com

github
/** Email created from pull request 499 (matiaselo:dev/dpdk_dev_config)
 ** https://github.com/Linaro/odp/pull/499
 ** Patch: https://github.com/Linaro/odp/pull/499.patch
 ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb
 ** Merge commit sha: ab13bcecea3972b5af189e9e5d6d4873790fb554
 **/
/github

checkpatch.pl
WARNING: 'intead' may be misspelled - perhaps 'instead'?
#10:
is not set, hardcoded default values are used intead. An example

WARNING: externs should be avoided in .c files
#394: FILE: platform/linux-generic/odp_libconfig.c:16:
+extern struct odp_global_data_s odp_global_data;

CHECK: Avoid CamelCase: 
#503: FILE: platform/linux-generic/pktio/dpdk.c:141:
+ printf("DPDK interface (%s): %" PRIu16 "\n", dev_info->driver_name,

total: 0 errors, 2 warnings, 1 checks, 420 lines checked

to_send-p-000.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
/checkpatch.pl
[lng-odp] [PATCH v7 2/2] validation: pool: verify pool data range
From: Michal Mazur

Allocate maximum number of packets from pool and verify that packet
data are located inside range returned by odp_pool_info.

Signed-off-by: Michal Mazur
---
/** Email created from pull request 495 (semihalf-mazur-michal:master)
 ** https://github.com/Linaro/odp/pull/495
 ** Patch: https://github.com/Linaro/odp/pull/495.patch
 ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb
 ** Merge commit sha: 79aeba092a0c85e26786ff8efbaeb71608ae1fa3
 **/
 test/validation/api/pool/pool.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/test/validation/api/pool/pool.c b/test/validation/api/pool/pool.c
index 34f973573..394c79497 100644
--- a/test/validation/api/pool/pool.c
+++ b/test/validation/api/pool/pool.c
@@ -217,6 +217,59 @@ static void pool_test_info_packet(void)
 	CU_ASSERT(odp_pool_destroy(pool) == 0);
 }
 
+static void pool_test_info_data_range(void)
+{
+	odp_pool_t pool;
+	odp_pool_info_t info;
+	odp_pool_param_t param;
+	odp_packet_t pkt[PKT_NUM];
+	uint32_t i, num;
+	uintptr_t pool_len;
+
+	odp_pool_param_init(&param);
+
+	param.type = ODP_POOL_PACKET;
+	param.pkt.num = PKT_NUM;
+	param.pkt.len = PKT_LEN;
+
+	pool = odp_pool_create(NULL, &param);
+	CU_ASSERT_FATAL(pool != ODP_POOL_INVALID);
+
+	CU_ASSERT_FATAL(odp_pool_info(pool, &info) == 0);
+
+	pool_len = info.max_data_addr - info.min_data_addr + 1;
+	CU_ASSERT(pool_len >= PKT_NUM * PKT_LEN);
+
+	num = 0;
+
+	for (i = 0; i < PKT_NUM; i++) {
+		pkt[num] = odp_packet_alloc(pool, PKT_LEN);
+		CU_ASSERT(pkt[num] != ODP_PACKET_INVALID);
+
+		if (pkt[num] != ODP_PACKET_INVALID)
+			num++;
+	}
+
+	for (i = 0; i < num; i++) {
+		uintptr_t pkt_data, pkt_data_end;
+		uint32_t offset = 0, seg_len;
+		uint32_t pkt_len = odp_packet_len(pkt[i]);
+
+		while (offset < pkt_len) {
+			pkt_data = (uintptr_t)odp_packet_offset(pkt[i], offset,
+								&seg_len, NULL);
+			pkt_data_end = pkt_data + seg_len - 1;
+			CU_ASSERT((pkt_data >= info.min_data_addr) &&
+				  (pkt_data_end <= info.max_data_addr));
+			offset += seg_len;
+		}
+
+		odp_packet_free(pkt[i]);
+	}
+
+	CU_ASSERT(odp_pool_destroy(pool) == 0);
+}
+
 odp_testinfo_t pool_suite[] = {
 	ODP_TEST_INFO(pool_test_create_destroy_buffer),
 	ODP_TEST_INFO(pool_test_create_destroy_packet),
@@ -225,6 +278,7 @@ odp_testinfo_t pool_suite[] = {
 	ODP_TEST_INFO(pool_test_alloc_packet_subparam),
 	ODP_TEST_INFO(pool_test_info_packet),
 	ODP_TEST_INFO(pool_test_lookup_info_print),
+	ODP_TEST_INFO(pool_test_info_data_range),
 	ODP_TEST_INFO_NULL,
 };
 
[lng-odp] [PATCH v7 1/2] linux-generic: pool: Return address range in pool info
From: Michal Mazur

Implement support in odp_pool_info function to provide
address range of pool data available to application.

Pull request of related API change:
https://github.com/Linaro/odp/pull/200

Signed-off-by: Michal Mazur
---
/** Email created from pull request 495 (semihalf-mazur-michal:master)
 ** https://github.com/Linaro/odp/pull/495
 ** Patch: https://github.com/Linaro/odp/pull/495.patch
 ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb
 ** Merge commit sha: 79aeba092a0c85e26786ff8efbaeb71608ae1fa3
 **/
 platform/linux-generic/odp_pool.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/platform/linux-generic/odp_pool.c b/platform/linux-generic/odp_pool.c
index e5ba8982a..03578135c 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -693,6 +693,9 @@ int odp_pool_info(odp_pool_t pool_hdl, odp_pool_info_t *info)
 	if (pool->params.type == ODP_POOL_PACKET)
 		info->pkt.max_num = pool->num;
 
+	info->min_data_addr = (uintptr_t)pool->base_addr;
+	info->max_data_addr = (uintptr_t)pool->base_addr + pool->shm_size - 1;
+
 	return 0;
 }
 
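The address range reported here is inclusive at both ends (`max_data_addr` is the address of the last byte, hence the `- 1`). A minimal sketch of how a caller might use such a range follows; `range_t`, `data_range()` and `in_range()` are hypothetical helpers written for this illustration, and only the two field names are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive address range, shaped like the two fields added to pool info. */
typedef struct {
	uintptr_t min_data_addr;
	uintptr_t max_data_addr;
} range_t;

/* Same arithmetic as the patch: base .. base + size - 1. */
static range_t data_range(const void *base, uintptr_t size)
{
	range_t r;

	assert(size > 0);
	r.min_data_addr = (uintptr_t)base;
	r.max_data_addr = (uintptr_t)base + size - 1;

	return r;
}

/* Non-zero when the 'len' bytes starting at 'data' lie inside the range. */
static int in_range(const range_t *r, const void *data, uint32_t len)
{
	uintptr_t start = (uintptr_t)data;
	uintptr_t end = start + len - 1;

	return len > 0 && start >= r->min_data_addr && end <= r->max_data_addr;
}
```

With an inclusive upper bound the length of the region is `max_data_addr - min_data_addr + 1`, which is the expression the validation test in patch 2/2 uses for `pool_len`.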
[lng-odp] [PATCH v7 0/2] linux-generic: pool: Return address range in pool info
Implement support in odp_pool_info function to provide address range of pool data available to application.

Similar change was already merged to caterpillar/linux-dpdk: #400
Pull request of related API change: #200

github
/** Email created from pull request 495 (semihalf-mazur-michal:master)
 ** https://github.com/Linaro/odp/pull/495
 ** Patch: https://github.com/Linaro/odp/pull/495.patch
 ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb
 ** Merge commit sha: 79aeba092a0c85e26786ff8efbaeb71608ae1fa3
 **/
/github

checkpatch.pl
total: 0 errors, 0 warnings, 0 checks, 9 lines checked

to_send-p-000.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 66 lines checked

to_send-p-001.patch has no obvious style problems and is ready for submission.
/checkpatch.pl
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 78 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; + + /* Empty */ + if (num == 0) + return 0; + + if (num > max_num) + num = max_num; + + idx = head & mask; + + for (i = 0; i < num; i++) { + data[i] = ring->data[idx]; + idx = (idx + 1) & mask; + } + + ring->head = head + num; + + return num; +} + +/* Enqueue data into the ring tail. Num_data is smaller than ring size. */ +static inline uint32_t ring_st_enq_multi(ring_st_t *ring, const uint32_t data[], +uint32_t num_data) +{ + uint32_t head, tail, mask, size, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + size = mask + 1; + num = size - (tail - head); Comment: Please see above discussion. The comment stands. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Not true. 
See the comment below where it's clear that `head` and `tail` are > free running and thus will eventually wrap as described above. If these were > `uint64_t` variables you could argue that the wraps would take centuries and > hence are nothing to worry about, but being `uint32_t` variables this should > be expected to happen regularly on heavily used queues. > > `abs()` adds no overhead since compilers treat it as an intrinsic. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Note that neither `head` nor `tail` get masked when doing enqueues or >> dequeues, so they will traverse the entire 32-bit range of the variable over >> time. While the ring itself will never hold more than `size` entries, `head` >> and `tail` will wrap around. When you index into the ring you do so via a >> mask (`idx = head & mask;`), but calculating the number of elements in the >> ring doesn't do this, which is why `abs()` is needed. >>> Petri Savolainen(psavol) wrote: >>> Plus this is done only once - in pool create phase. Petri Savolainen(psavol) wrote: No since ring size will be much smaller than 4 billion. > Petri Savolainen(psavol) wrote: > The point is that ring size will never be close to 4 billion entries. > E.g. currently tail is always max 4096 larger than head. Your example > above is based assumption of 4 billion entry ring. Overflow is avoided > when ring size if <2 billion, as 32 bit indexes can be still used to > calculate number of items correctly. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Agreed. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Agreed, this is good for now. Later we may wish to honor the >>> user-requested queue `size` parameter. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Agreed. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Correct, as noted earlier. I withdraw that comment. 
>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Presumably the compiler will see the overlap and optimize away the >> redundancy, so I assume the performance impact will be nil here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Since you're allowing `head` and `tail` to run over the entire >>> 32-bit range, you're correct that you can completely fill the ring. >>> I did miss that point. However, as shown above you still need this >>> to be: >>> >>> ``` >>> num = size - abs(tail - head); >>> ``` >>> To avoid problems at the 32-bit wrap boundary. Bill Fischofer(Bill-Fischofer-Linaro) wrote: You're computing in 32 bits not 8 bits, and your ring
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 46 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; Comment: Not true. See the comment below where it's clear that `head` and `tail` are free running and thus will eventually wrap as described above. If these were `uint64_t` variables you could argue that the wraps would take centuries and hence are nothing to worry about, but being `uint32_t` variables this should be expected to happen regularly on heavily used queues. `abs()` adds no overhead since compilers treat it as an intrinsic. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Note that neither `head` nor `tail` get masked when doing enqueues or > dequeues, so they will traverse the entire 32-bit range of the variable over > time. While the ring itself will never hold more than `size` entries, `head` > and `tail` will wrap around. 
When you index into the ring you do so via a > mask (`idx = head & mask;`), but calculating the number of elements in the > ring doesn't do this, which is why `abs()` is needed. >> Petri Savolainen(psavol) wrote: >> Plus this is done only once - in pool create phase. >>> Petri Savolainen(psavol) wrote: >>> No since ring size will be much smaller than 4 billion. Petri Savolainen(psavol) wrote: The point is that ring size will never be close to 4 billion entries. E.g. currently tail is always max 4096 larger than head. Your example above is based assumption of 4 billion entry ring. Overflow is avoided when ring size if <2 billion, as 32 bit indexes can be still used to calculate number of items correctly. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Agreed. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Agreed, this is good for now. Later we may wish to honor the >> user-requested queue `size` parameter. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Agreed. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Correct, as noted earlier. I withdraw that comment. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Presumably the compiler will see the overlap and optimize away the > redundancy, so I assume the performance impact will be nil here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Since you're allowing `head` and `tail` to run over the entire >> 32-bit range, you're correct that you can completely fill the ring. >> I did miss that point. However, as shown above you still need this >> to be: >> >> ``` >> num = size - abs(tail - head); >> ``` >> To avoid problems at the 32-bit wrap boundary. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> You're computing in 32 bits not 8 bits, and your ring size is less >>> than 2^32 elements. 
Consider the following test program: >>> ``` >>> #include >>> #include >>> #include >>> #include >>> >>> void main() { >>> uint32_t head[4] = {0, 1, 2, 3}; >>> uint32_t tail[4] = {0, 0x, 0xfffe, 0xfffd}; >>> uint32_t mask = 4095; >>> uint32_t result; >>> int i; >>> >>> for (i = 0; i < 4; i++) { >>> printf("head = %" PRIu32 " tail = %" PRIu32 ":\n", >>>head[i], tail[i]); >>> >>> result = tail[i] - head[i]; >>> printf("tail - head = %" PRIu32 "\n", result); >>> >>> result = (tail[i] - head[i]) & mask; >>> printf("(tail - head) & mask = %" PRIu32 "\n", result); >>> >>>
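The point argued in this thread, that free-running 32-bit `head` and `tail` indexes still give the right element count across the wrap, can be checked with the small sketch below. It is an illustration only: `ring_count()` and `ring_free()` are hypothetical helper names, and the sketch assumes, as the thread does, that `tail` is never more than the ring size ahead of `head`.

```c
#include <assert.h>
#include <stdint.h>

/* Number of stored entries. Unsigned subtraction is modulo 2^32, so the
 * result stays correct after 'tail' has wrapped past zero and 'head'
 * has not yet done so. */
static uint32_t ring_count(uint32_t head, uint32_t tail)
{
	return tail - head;
}

/* Free slots in a power-of-two ring described by 'mask' (size - 1). */
static uint32_t ring_free(uint32_t head, uint32_t tail, uint32_t mask)
{
	uint32_t size = mask + 1;

	assert(ring_count(head, tail) <= size);

	return size - ring_count(head, tail);
}
```

For example, with `head = 0xfffffffe` and `tail = 2` the ring holds four entries, and `tail - head` evaluates to 4 without any masking or `abs()`.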
Re: [lng-odp] [PATCH API-NEXT v1] api: packet: data and segment length
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: include/odp/api/spec/packet.h line 39 @@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt); /** * Packet data pointer * - * Returns the current packet data pointer. When a packet is received - * from packet input, this points to the first byte of the received - * packet. Packet level offsets are calculated relative to this position. + * Returns pointer to the first byte of packet data. When packet is segmented, + * only a portion of packet data follows the pointer. When unsure, use e.g. + * odp_packet_seg_len() to check the data length following the pointer. Packet + * level offsets are calculated relative to this position. * - * User can adjust the data pointer with head_push/head_pull (does not modify - * segmentation) and add_data/rem_data calls (may modify segmentation). + * When a packet is received from packet input, this points to the first byte + * of the received packet. Pool configuration parameters may be used to ensure + * that the first packet segment contains all/most of the data relevant to the + * application. + * + * User can adjust the data pointer with e.g. push_head/pull_head (does not + * modify segmentation) and extend_head/trunc_head (may modify segmentation) + * calls. * * @param pkt Packet handle * * @return Pointer to the packet data * - * @see odp_packet_l2_ptr(), odp_packet_seg_len() + * @see odp_packet_seg_len(), odp_packet_push_head(), odp_packet_extend_head() */ void *odp_packet_data(odp_packet_t pkt); /** - * Packet segment data length + * Packet data length following the data pointer * - * Returns number of data bytes following the current data pointer - * (odp_packet_data()) location in the segment. + * Returns number of data bytes (in the segment) following the current data + * pointer position. 
When unsure, use this function to check how many bytes Comment: Segments are inherently implementation-dependent, which is why portable applications should avoid processing packets in terms of segments. Applications only need be aware that segments may exist, meaning that packets may not be contiguously addressable. This is why a `seg_len` is returned on the various routines that provide addressability to packets. `odp_packet_data()` is the exception, and this was done as an efficiency measure for applications looking at headers contained at the start of the packet that are known to be contiguous because of the `min_seg_len` pool specification. > Petri Savolainen(psavol) wrote: > Yes. I didn't want to introduce a new limitation to segment/reference > implementation here. This patch just tries to make it clear that packets may > be segmented. > > E.g. Bill's reference implementation resulted packets that had first segment > length 0 and data pointer pointed to the second segment. It passed all > validation tests. From application point of view, it does not matter much if > spec allows empty segments to be linked into packet, although it's not very > intuitive and should be avoided when possible. >> Balasubramanian Manoharan(bala-manoharan) wrote: >> In the entire documentation you have avoided the term "first segment" is it >> by choice? >> IMO we could refer this as first segment valid data bytes >>> Petri Savolainen(psavol) wrote: >>> seg_len / push_head / extend_head are mentioned above. Packet_offset is not >>> specialized for handling first N bytes of packet, so it's not directly >>> related to these ones. Petri Savolainen(psavol) wrote: packet_offset is for different purpose (access data on arbitrary offset), these calls are optimized for the common case (offset zero). Also odp_packet_seg_data_len(), the new data_seg_len(), odp_packet_l2_ptr(), odp_packet_l3_ptr() and odp_packet_l4_ptr() output seg len, but we don't need to list all possible ways to get it. 
It's enough that the reader understands that a packet may have segments and segment length is different thing than total packet length > Dmitry Eremin-Solenikov(lumag) wrote: > And here too, please. >> Dmitry Eremin-Solenikov(lumag) wrote: >> odp_packet_seg_len() **or odp_packet_offset()**, if you don't mind. https://github.com/Linaro/odp/pull/497#discussion_r170275071 updated_at 2018-02-23 15:06:52
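To make the distinction between segment length and total packet length concrete, here is a toy model of a segmented packet. It is not the ODP implementation: `seg_t`, `pkt_t`, `pkt_len()`, `pkt_offset()` and `pkt_sum()` are hypothetical names, with `pkt_offset()` loosely modelled on the idea of returning a pointer plus the number of contiguous bytes that follow it. Empty segments are tolerated, as discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy packet: an array of segments, each a contiguous byte run. */
typedef struct {
	const uint8_t *data;
	uint32_t len;
} seg_t;

typedef struct {
	const seg_t *seg;
	uint32_t num_segs;
} pkt_t;

/* Total packet length: the sum over all segments. */
static uint32_t pkt_len(const pkt_t *pkt)
{
	uint32_t i, len = 0;

	for (i = 0; i < pkt->num_segs; i++)
		len += pkt->seg[i].len;

	return len;
}

/* Pointer to the byte at 'offset'; *seg_len receives the number of
 * contiguous bytes from there to the end of that segment.
 * Returns NULL when 'offset' is outside the packet. */
static const uint8_t *pkt_offset(const pkt_t *pkt, uint32_t offset,
				 uint32_t *seg_len)
{
	uint32_t i;

	assert(seg_len != NULL);

	for (i = 0; i < pkt->num_segs; i++) {
		if (offset < pkt->seg[i].len) {
			*seg_len = pkt->seg[i].len - offset;
			return pkt->seg[i].data + offset;
		}
		offset -= pkt->seg[i].len; /* also skips empty segments */
	}

	return NULL;
}

/* Visit every byte by walking one contiguous chunk at a time. */
static uint32_t pkt_sum(const pkt_t *pkt)
{
	uint32_t total = pkt_len(pkt);
	uint32_t offset = 0, seg_len = 0, sum = 0, i;

	while (offset < total) {
		const uint8_t *p = pkt_offset(pkt, offset, &seg_len);

		if (p == NULL)
			break;

		for (i = 0; i < seg_len; i++)
			sum += p[i];

		offset += seg_len;
	}

	return sum;
}
```

The loop in `pkt_sum()` never reads past the bytes reported as contiguous, which is the habit the documentation change is trying to encourage.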
[lng-odp] [PATCH v6 1/2] linux-generic: pool: Return address range in pool info
From: Michal Mazur

Implement support in odp_pool_info function to provide
address range of pool data available to application.

Pull request of related API change:
https://github.com/Linaro/odp/pull/200

Signed-off-by: Michal Mazur
---
/** Email created from pull request 495 (semihalf-mazur-michal:master)
 ** https://github.com/Linaro/odp/pull/495
 ** Patch: https://github.com/Linaro/odp/pull/495.patch
 ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb
 ** Merge commit sha: 0390589df2dbf51cfd601bc4baf1b06b571653bb
 **/
 platform/linux-generic/odp_pool.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/platform/linux-generic/odp_pool.c b/platform/linux-generic/odp_pool.c
index e5ba8982a..03578135c 100644
--- a/platform/linux-generic/odp_pool.c
+++ b/platform/linux-generic/odp_pool.c
@@ -693,6 +693,9 @@ int odp_pool_info(odp_pool_t pool_hdl, odp_pool_info_t *info)
 	if (pool->params.type == ODP_POOL_PACKET)
 		info->pkt.max_num = pool->num;
 
+	info->min_data_addr = (uintptr_t)pool->base_addr;
+	info->max_data_addr = (uintptr_t)pool->base_addr + pool->shm_size - 1;
+
 	return 0;
 }
 
[lng-odp] [PATCH v6 2/2] validation: pool: verify pool data range
From: Michal Mazur

Allocate maximum number of packets from pool and verify that packet data are located inside range returned by odp_pool_info.

Signed-off-by: Michal Mazur
---
/** Email created from pull request 495 (semihalf-mazur-michal:master)
 ** https://github.com/Linaro/odp/pull/495
 ** Patch: https://github.com/Linaro/odp/pull/495.patch
 ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb
 ** Merge commit sha: 0390589df2dbf51cfd601bc4baf1b06b571653bb
 **/
 test/validation/api/pool/pool.c | 46 +
 1 file changed, 46 insertions(+)

diff --git a/test/validation/api/pool/pool.c b/test/validation/api/pool/pool.c
index 34f973573..f6ebb6886 100644
--- a/test/validation/api/pool/pool.c
+++ b/test/validation/api/pool/pool.c
@@ -217,6 +217,51 @@ static void pool_test_info_packet(void)
 	CU_ASSERT(odp_pool_destroy(pool) == 0);
 }
 
+static void pool_test_info_data_range(void)
+{
+	odp_pool_t pool;
+	odp_pool_info_t info;
+	odp_pool_param_t param;
+	odp_packet_t pkt[PKT_NUM];
+	uint32_t i, num, seg_len;
+	uintptr_t pkt_data, pool_len;
+
+	odp_pool_param_init(&param);
+
+	param.type = ODP_POOL_PACKET;
+	param.pkt.num = PKT_NUM;
+	param.pkt.len = PKT_LEN;
+
+	pool = odp_pool_create(NULL, &param);
+	CU_ASSERT_FATAL(pool != ODP_POOL_INVALID);
+
+	CU_ASSERT_FATAL(odp_pool_info(pool, &info) == 0);
+
+	pool_len = info.max_data_addr - info.min_data_addr + 1;
+	CU_ASSERT(pool_len >= PKT_NUM * PKT_LEN);
+
+	num = 0;
+
+	for (i = 0; i < PKT_NUM; i++) {
+		pkt[num] = odp_packet_alloc(pool, PKT_LEN);
+		CU_ASSERT(pkt[num] != ODP_PACKET_INVALID);
+
+		if (pkt[num] != ODP_PACKET_INVALID)
+			num++;
+	}
+
+	for (i = 0; i < num; i++) {
+		pkt_data = (uintptr_t)odp_packet_data(pkt[i]);
+		seg_len = odp_packet_seg_len(pkt[i]);
+		CU_ASSERT((pkt_data >= info.min_data_addr) &&
+			  (pkt_data + seg_len - 1 <= info.max_data_addr));
+
+		odp_packet_free(pkt[i]);
+	}
+
+	CU_ASSERT(odp_pool_destroy(pool) == 0);
+}
+
 odp_testinfo_t pool_suite[] = {
 	ODP_TEST_INFO(pool_test_create_destroy_buffer),
 	ODP_TEST_INFO(pool_test_create_destroy_packet),
@@ -225,6 +270,7 @@ odp_testinfo_t pool_suite[] = {
 	ODP_TEST_INFO(pool_test_alloc_packet_subparam),
 	ODP_TEST_INFO(pool_test_info_packet),
 	ODP_TEST_INFO(pool_test_lookup_info_print),
+	ODP_TEST_INFO(pool_test_info_data_range),
 	ODP_TEST_INFO_NULL,
 };
[lng-odp] [PATCH v6 0/2] linux-generic: pool: Return address range in pool info
Implement support in odp_pool_info function to provide address range of pool data available to application. Similar change was already merged to caterpillar/linux-dpdk: #400 Pull request of related API change: #200 github /** Email created from pull request 495 (semihalf-mazur-michal:master) ** https://github.com/Linaro/odp/pull/495 ** Patch: https://github.com/Linaro/odp/pull/495.patch ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb ** Merge commit sha: 0390589df2dbf51cfd601bc4baf1b06b571653bb **/ /github checkpatch.pl total: 0 errors, 0 warnings, 0 checks, 9 lines checked to_send-p-000.patch has no obvious style problems and is ready for submission. total: 0 errors, 0 warnings, 0 checks, 58 lines checked to_send-p-001.patch has no obvious style problems and is ready for submission. /checkpatch.pl
Re: [lng-odp] [PATCH API-NEXT v1] api: packet: data and segment length
Petri Savolainen(psavol) replied on github web page: include/odp/api/spec/packet.h line 39 @@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt); /** * Packet data pointer * - * Returns the current packet data pointer. When a packet is received - * from packet input, this points to the first byte of the received - * packet. Packet level offsets are calculated relative to this position. + * Returns pointer to the first byte of packet data. When packet is segmented, + * only a portion of packet data follows the pointer. When unsure, use e.g. + * odp_packet_seg_len() to check the data length following the pointer. Packet + * level offsets are calculated relative to this position. * - * User can adjust the data pointer with head_push/head_pull (does not modify - * segmentation) and add_data/rem_data calls (may modify segmentation). + * When a packet is received from packet input, this points to the first byte + * of the received packet. Pool configuration parameters may be used to ensure + * that the first packet segment contains all/most of the data relevant to the + * application. + * + * User can adjust the data pointer with e.g. push_head/pull_head (does not + * modify segmentation) and extend_head/trunc_head (may modify segmentation) + * calls. * * @param pkt Packet handle * * @return Pointer to the packet data * - * @see odp_packet_l2_ptr(), odp_packet_seg_len() + * @see odp_packet_seg_len(), odp_packet_push_head(), odp_packet_extend_head() */ void *odp_packet_data(odp_packet_t pkt); /** - * Packet segment data length + * Packet data length following the data pointer * - * Returns number of data bytes following the current data pointer - * (odp_packet_data()) location in the segment. + * Returns number of data bytes (in the segment) following the current data + * pointer position. When unsure, use this function to check how many bytes Comment: Yes. I didn't want to introduce a new limitation to segment/reference implementation here. 
This patch just tries to make it clear that packets may be segmented. E.g. Bill's reference implementation resulted packets that had first segment length 0 and data pointer pointed to the second segment. It passed all validation tests. From application point of view, it does not matter much if spec allows empty segments to be linked into packet, although it's not very intuitive and should be avoided when possible. > Balasubramanian Manoharan(bala-manoharan) wrote: > In the entire documentation you have avoided the term "first segment" is it > by choice? > IMO we could refer this as first segment valid data bytes >> Petri Savolainen(psavol) wrote: >> seg_len / push_head / extend_head are mentioned above. Packet_offset is not >> specialized for handling first N bytes of packet, so it's not directly >> related to these ones. >>> Petri Savolainen(psavol) wrote: >>> packet_offset is for different purpose (access data on arbitrary offset), >>> these calls are optimized for the common case (offset zero). Also >>> odp_packet_seg_data_len(), the new data_seg_len(), odp_packet_l2_ptr(), >>> odp_packet_l3_ptr() and odp_packet_l4_ptr() output seg len, but we don't >>> need to list all possible ways to get it. It's enough that the reader >>> understands that a packet may have segments and segment length is different >>> thing than total packet length Dmitry Eremin-Solenikov(lumag) wrote: And here too, please. > Dmitry Eremin-Solenikov(lumag) wrote: > odp_packet_seg_len() **or odp_packet_offset()**, if you don't mind. https://github.com/Linaro/odp/pull/497#discussion_r170220664 updated_at 2018-02-23 10:45:55
Re: [lng-odp] [PATCH API-NEXT v1] api: packet: data and segment length
Balasubramanian Manoharan(bala-manoharan) replied on github web page: include/odp/api/spec/packet.h line 39 @@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt); /** * Packet data pointer * - * Returns the current packet data pointer. When a packet is received - * from packet input, this points to the first byte of the received - * packet. Packet level offsets are calculated relative to this position. + * Returns pointer to the first byte of packet data. When packet is segmented, + * only a portion of packet data follows the pointer. When unsure, use e.g. + * odp_packet_seg_len() to check the data length following the pointer. Packet + * level offsets are calculated relative to this position. * - * User can adjust the data pointer with head_push/head_pull (does not modify - * segmentation) and add_data/rem_data calls (may modify segmentation). + * When a packet is received from packet input, this points to the first byte + * of the received packet. Pool configuration parameters may be used to ensure + * that the first packet segment contains all/most of the data relevant to the + * application. + * + * User can adjust the data pointer with e.g. push_head/pull_head (does not + * modify segmentation) and extend_head/trunc_head (may modify segmentation) + * calls. * * @param pkt Packet handle * * @return Pointer to the packet data * - * @see odp_packet_l2_ptr(), odp_packet_seg_len() + * @see odp_packet_seg_len(), odp_packet_push_head(), odp_packet_extend_head() */ void *odp_packet_data(odp_packet_t pkt); /** - * Packet segment data length + * Packet data length following the data pointer * - * Returns number of data bytes following the current data pointer - * (odp_packet_data()) location in the segment. + * Returns number of data bytes (in the segment) following the current data + * pointer position. When unsure, use this function to check how many bytes Comment: In the entire documentation you have avoided the term "first segment" is it by choice? 
IMO we could refer this as first segment valid data bytes > Petri Savolainen(psavol) wrote: > seg_len / push_head / extend_head are mentioned above. Packet_offset is not > specialized for handling first N bytes of packet, so it's not directly > related to these ones. >> Petri Savolainen(psavol) wrote: >> packet_offset is for different purpose (access data on arbitrary offset), >> these calls are optimized for the common case (offset zero). Also >> odp_packet_seg_data_len(), the new data_seg_len(), odp_packet_l2_ptr(), >> odp_packet_l3_ptr() and odp_packet_l4_ptr() output seg len, but we don't >> need to list all possible ways to get it. It's enough that the reader >> understands that a packet may have segments and segment length is different >> thing than total packet length >>> Dmitry Eremin-Solenikov(lumag) wrote: >>> And here too, please. Dmitry Eremin-Solenikov(lumag) wrote: odp_packet_seg_len() **or odp_packet_offset()**, if you don't mind. https://github.com/Linaro/odp/pull/497#discussion_r170204670 updated_at 2018-02-23 09:36:37
Re: [lng-odp] [PATCH v5] linux-generic: pool: Return address range in pool info
Petri Savolainen(psavol) replied on github web page: test/validation/api/pool/pool.c line 40 @@ -217,6 +217,50 @@ static void pool_test_info_packet(void) CU_ASSERT(odp_pool_destroy(pool) == 0); } +static void pool_test_info_data_range(void) +{ + odp_pool_t pool; + odp_pool_info_t info; + odp_pool_param_t param; + odp_packet_t pkt[PKT_NUM]; + uint32_t i, num; + uintptr_t pkt_data, pool_len; + + odp_pool_param_init(); + + param.type = ODP_POOL_PACKET; + param.pkt.num = PKT_NUM; + param.pkt.len = PKT_LEN; + + pool = odp_pool_create(NULL, ); + CU_ASSERT_FATAL(pool != ODP_POOL_INVALID); + + CU_ASSERT_FATAL(odp_pool_info(pool, ) == 0); + + pool_len = info.max_data_addr - info.min_data_addr + 1; + CU_ASSERT(pool_len >= PKT_NUM * PKT_LEN); + + num = 0; + + for (i = 0; i < PKT_NUM; i++) { + pkt[num] = odp_packet_alloc(pool, PKT_LEN); + CU_ASSERT(pkt[num] != ODP_PACKET_INVALID); + + if (pkt[num] != ODP_PACKET_INVALID) + num++; + } + + for (i = 0; i < num; i++) { + pkt_data = (uintptr_t)odp_packet_data(pkt[i]); + CU_ASSERT((pkt_data >= info.min_data_addr) && + (pkt_data + PKT_LEN - 1 <= info.max_data_addr)); Comment: It wrong to assume that entire packet data follows data pointer. Use odp_packet_seg_len() instead of PKT_LEN here. > semihalf-mazur-michal wrote > Fixed in v4 >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> I'd make this `CU_ASSERT()` rather than `CU_ASSERT_FATAL()` so all >> discrepancies can be caught. `CU_ASSERT_FATAL()` is reserved for setup >> failures that invalidate the entire test (_e.g.,_ not being able to create >> the pool, `odp_pool_info()` reporting an error, etc.) https://github.com/Linaro/odp/pull/495#discussion_r170193779 updated_at 2018-02-23 08:43:57
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/odp_pool.c line 28 @@ -296,7 +282,9 @@ static void init_buffers(pool_t *pool) memset(buf_hdr, 0, (uintptr_t)data - (uintptr_t)buf_hdr); /* Initialize buffer metadata */ - buf_hdr->index = i; + buf_hdr->index.u32= 0; + buf_hdr->index.pool = pool->pool_idx; + buf_hdr->index.buffer = i; Comment: Plus this is done only once - in pool create phase. > Petri Savolainen(psavol) wrote: > No since ring size will be much smaller than 4 billion. >> Petri Savolainen(psavol) wrote: >> The point is that ring size will never be close to 4 billion entries. E.g. >> currently tail is always max 4096 larger than head. Your example above is >> based assumption of 4 billion entry ring. Overflow is avoided when ring size >> if <2 billion, as 32 bit indexes can be still used to calculate number of >> items correctly. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Agreed. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Agreed, this is good for now. Later we may wish to honor the user-requested queue `size` parameter. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Agreed. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Correct, as noted earlier. I withdraw that comment. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Presumably the compiler will see the overlap and optimize away the >>> redundancy, so I assume the performance impact will be nil here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Since you're allowing `head` and `tail` to run over the entire 32-bit range, you're correct that you can completely fill the ring. I did miss that point. However, as shown above you still need this to be: ``` num = size - abs(tail - head); ``` To avoid problems at the 32-bit wrap boundary. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > You're computing in 32 bits not 8 bits, and your ring size is less > than 2^32 elements. 
> Consider the following test program:
> ```
> #include <stdlib.h>
> #include <stdio.h>
> #include <stdint.h>
> #include <inttypes.h>
>
> void main() {
> 	uint32_t head[4] = {0, 1, 2, 3};
> 	uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd};
> 	uint32_t mask = 4095;
> 	uint32_t result;
> 	int i;
>
> 	for (i = 0; i < 4; i++) {
> 		printf("head = %" PRIu32 " tail = %" PRIu32 ":\n",
> 		       head[i], tail[i]);
>
> 		result = tail[i] - head[i];
> 		printf("tail - head = %" PRIu32 "\n", result);
>
> 		result = (tail[i] - head[i]) & mask;
> 		printf("(tail - head) & mask = %" PRIu32 "\n", result);
>
> 		result = abs(tail[i] - head[i]);
> 		printf("abs(tail - head) = %" PRIu32 "\n\n", result);
> 	}
> }
> ```
> in theory `tail - head` should be the number of elements in the ring,
> in this case 0, 2, 4, and 6. But running this test program gives the
> following output:
> ```
> head = 0 tail = 0:
> tail - head = 0
> (tail - head) & mask = 0
> abs(tail - head) = 0
>
> head = 1 tail = 4294967295:
> tail - head = 4294967294
> (tail - head) & mask = 4094
> abs(tail - head) = 2
>
> head = 2 tail = 4294967294:
> tail - head = 4294967292
> (tail - head) & mask = 4092
> abs(tail - head) = 4
>
> head = 3 tail = 4294967293:
> tail - head = 4294967290
> (tail - head) & mask = 4090
> abs(tail - head) = 6
> ```
> Since you're allowing head to run free over the 32-bit range of the
> variable, when the 32-bits rolls over you'll get a large positive
> number, not the small one you need to stay within the ring bounds.
> The alternative is to mask `head` and `tail` as you increment them,
> but then you run into the effective range issue.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> OK

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> OK, the `ODP_ASSERT()` would still be useful for debugging.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
That functionality is not obvious from the name.
It either implies that one of the input arguments is written (not true here) or the reader might assume that it is an expression without side-effect and should be deleted (what I originally thought when reading it). You
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 78 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; + + /* Empty */ + if (num == 0) + return 0; + + if (num > max_num) + num = max_num; + + idx = head & mask; + + for (i = 0; i < num; i++) { + data[i] = ring->data[idx]; + idx = (idx + 1) & mask; + } + + ring->head = head + num; + + return num; +} + +/* Enqueue data into the ring tail. Num_data is smaller than ring size. */ +static inline uint32_t ring_st_enq_multi(ring_st_t *ring, const uint32_t data[], +uint32_t num_data) +{ + uint32_t head, tail, mask, size, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + size = mask + 1; + num = size - (tail - head); Comment: No since ring size will be much smaller than 4 billion. > Petri Savolainen(psavol) wrote: > The point is that ring size will never be close to 4 billion entries. E.g. 
> currently tail is always max 4096 larger than head. Your example above is
> based on the assumption of a 4 billion entry ring. Overflow is avoided when
> ring size is <2 billion, as 32 bit indexes can still be used to calculate
> the number of items correctly.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Agreed.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Agreed, this is good for now. Later we may wish to honor the user-requested
>>> queue `size` parameter.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Agreed.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Correct, as noted earlier. I withdraw that comment.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Presumably the compiler will see the overlap and optimize away the
>> redundancy, so I assume the performance impact will be nil here.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Since you're allowing `head` and `tail` to run over the entire 32-bit
>>> range, you're correct that you can completely fill the ring. I did miss
>>> that point. However, as shown above you still need this to be:
>>>
>>> ```
>>> num = size - abs(tail - head);
>>> ```
>>> To avoid problems at the 32-bit wrap boundary.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
You're computing in 32 bits not 8 bits, and your ring size is less than 2^32 elements. Consider the following test program:
```
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

void main() {
	uint32_t head[4] = {0, 1, 2, 3};
	uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd};
	uint32_t mask = 4095;
	uint32_t result;
	int i;

	for (i = 0; i < 4; i++) {
		printf("head = %" PRIu32 " tail = %" PRIu32 ":\n",
		       head[i], tail[i]);

		result = tail[i] - head[i];
		printf("tail - head = %" PRIu32 "\n", result);

		result = (tail[i] - head[i]) & mask;
		printf("(tail - head) & mask = %" PRIu32 "\n", result);

		result = abs(tail[i] - head[i]);
		printf("abs(tail - head) = %" PRIu32 "\n\n", result);
	}
}
```
in theory `tail - head` should be the number of elements in the ring, in this case 0, 2, 4, and 6.
But running this test program gives the following output: ``` head = 0 tail = 0: tail - head = 0
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 46 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; Comment: The point is that ring size will never be close to 4 billion entries. E.g. currently tail is always max 4096 larger than head. Your example above is based assumption of 4 billion entry ring. Overflow is avoided when ring size if <2 billion, as 32 bit indexes can be still used to calculate number of items correctly. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Agreed. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Agreed, this is good for now. Later we may wish to honor the user-requested >> queue `size` parameter. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Agreed. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Correct, as noted earlier. I withdraw that comment. 
> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Presumably the compiler will see the overlap and optimize away the
> redundancy, so I assume the performance impact will be nil here.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Since you're allowing `head` and `tail` to run over the entire 32-bit
>> range, you're correct that you can completely fill the ring. I did miss
>> that point. However, as shown above you still need this to be:
>>
>> ```
>> num = size - abs(tail - head);
>> ```
>> To avoid problems at the 32-bit wrap boundary.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> You're computing in 32 bits not 8 bits, and your ring size is less than
>>> 2^32 elements. Consider the following test program:
>>> ```
>>> #include <stdlib.h>
>>> #include <stdio.h>
>>> #include <stdint.h>
>>> #include <inttypes.h>
>>>
>>> void main() {
>>> 	uint32_t head[4] = {0, 1, 2, 3};
>>> 	uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd};
>>> 	uint32_t mask = 4095;
>>> 	uint32_t result;
>>> 	int i;
>>>
>>> 	for (i = 0; i < 4; i++) {
>>> 		printf("head = %" PRIu32 " tail = %" PRIu32 ":\n",
>>> 		       head[i], tail[i]);
>>>
>>> 		result = tail[i] - head[i];
>>> 		printf("tail - head = %" PRIu32 "\n", result);
>>>
>>> 		result = (tail[i] - head[i]) & mask;
>>> 		printf("(tail - head) & mask = %" PRIu32 "\n", result);
>>>
>>> 		result = abs(tail[i] - head[i]);
>>> 		printf("abs(tail - head) = %" PRIu32 "\n\n", result);
>>> 	}
>>> }
>>> ```
>>> in theory `tail - head` should be the number of elements in the ring,
>>> in this case 0, 2, 4, and 6. But running this test program gives the
>>> following output:
>>> ```
>>> head = 0 tail = 0:
>>> tail - head = 0
>>> (tail - head) & mask = 0
>>> abs(tail - head) = 0
>>>
>>> head = 1 tail = 4294967295:
>>> tail - head = 4294967294
>>> (tail - head) & mask = 4094
>>> abs(tail - head) = 2
>>>
>>> head = 2 tail = 4294967294:
>>> tail - head = 4294967292
>>> (tail - head) & mask = 4092
>>> abs(tail - head) = 4
>>>
>>> head = 3 tail = 4294967293:
>>> tail - head = 4294967290
>>> (tail - head) & mask = 4090
>>> abs(tail - head) = 6
>>> ```
>>> Since you're allowing head to run free over the 32-bit range of the
>>> variable, when the 32-bits rolls over you'll get a large positive
>>> number, not the small one you need to stay within the ring bounds. The
>>> alternative is to mask `head` and `tail` as you increment them, but
>>> then you run into the effective range issue.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
OK
>
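As a small illustration of the point made in this thread (a sketch only, not the ring_st code itself): with free-running 32-bit indexes where tail only ever runs ahead of head by at most the ring size, plain unsigned subtraction gives the item count even after the indexes wrap, because the subtraction is done modulo 2^32. The helper names ring_count(), ring_free() and ring_slot() below are made up for this example; they assume a power-of-two ring size, as the ring init comment requires.

```c
#include <stdint.h>

/* Number of stored items, given free-running 32-bit indexes. Assumes tail
 * is never more than the ring size ahead of head. */
static inline uint32_t ring_count(uint32_t head, uint32_t tail)
{
	return tail - head; /* unsigned subtraction wraps modulo 2^32 */
}

/* Free slots left in a ring of 'size' entries. */
static inline uint32_t ring_free(uint32_t head, uint32_t tail, uint32_t size)
{
	return size - ring_count(head, tail);
}

/* Storage slot for a free-running index; mask is size - 1 and size is a
 * power of two, so the slot sequence stays continuous across the wrap. */
static inline uint32_t ring_slot(uint32_t index, uint32_t mask)
{
	return index & mask;
}
```

For example, with head = 0xfffffffe and tail = 2 (tail has wrapped, head has not), ring_count() returns 4 and a 4096 entry ring has 4092 free slots. The test program quoted above instead places head ahead of tail, which is the case the reply describes as corresponding to a 4 billion entry ring.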
Re: [lng-odp] [PATCH API-NEXT v1] api: packet: data and segment length
Petri Savolainen(psavol) replied on github web page: include/odp/api/spec/packet.h line 28 @@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt); /** * Packet data pointer * - * Returns the current packet data pointer. When a packet is received - * from packet input, this points to the first byte of the received - * packet. Packet level offsets are calculated relative to this position. + * Returns pointer to the first byte of packet data. When packet is segmented, + * only a portion of packet data follows the pointer. When unsure, use e.g. + * odp_packet_seg_len() to check the data length following the pointer. Packet + * level offsets are calculated relative to this position. * - * User can adjust the data pointer with head_push/head_pull (does not modify - * segmentation) and add_data/rem_data calls (may modify segmentation). + * When a packet is received from packet input, this points to the first byte + * of the received packet. Pool configuration parameters may be used to ensure + * that the first packet segment contains all/most of the data relevant to the + * application. + * + * User can adjust the data pointer with e.g. push_head/pull_head (does not + * modify segmentation) and extend_head/trunc_head (may modify segmentation) + * calls. * * @param pkt Packet handle * * @return Pointer to the packet data * - * @see odp_packet_l2_ptr(), odp_packet_seg_len() + * @see odp_packet_seg_len(), odp_packet_push_head(), odp_packet_extend_head() Comment: seg_len / push_head / extend_head are mentioned above. Packet_offset is not specialized for handling first N bytes of packet, so it's not directly related to these ones. > Petri Savolainen(psavol) wrote: > packet_offset is for different purpose (access data on arbitrary offset), > these calls are optimized for the common case (offset zero). 
Also > odp_packet_seg_data_len(), the new data_seg_len(), odp_packet_l2_ptr(), > odp_packet_l3_ptr() and odp_packet_l4_ptr() output seg len, but we don't > need to list all possible ways to get it. It's enough that the reader > understands that a packet may have segments and segment length is different > thing than total packet length >> Dmitry Eremin-Solenikov(lumag) wrote: >> And here too, please. >>> Dmitry Eremin-Solenikov(lumag) wrote: >>> odp_packet_seg_len() **or odp_packet_offset()**, if you don't mind. https://github.com/Linaro/odp/pull/497#discussion_r170183024 updated_at 2018-02-23 07:36:58
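To make the segment length versus packet length distinction concrete, here is a toy model in C. It deliberately does not use the ODP types: seg_t, pkt_len() and pkt_copy_to_mem() are hypothetical and only mimic the idea that the bytes following the data pointer cover the first segment, while the total packet length is the sum over all segments. Reading across a segment boundary therefore needs a copy (or per-segment access) rather than a single pointer dereference.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one packet segment; not an ODP type. */
typedef struct seg {
	const uint8_t *data;    /* first data byte of this segment */
	uint32_t len;           /* contiguous bytes following data */
	const struct seg *next; /* next segment, NULL at the end */
} seg_t;

/* Total packet length: sum of all segment lengths. */
static uint32_t pkt_len(const seg_t *seg)
{
	uint32_t total = 0;

	for (; seg != NULL; seg = seg->next)
		total += seg->len;

	return total;
}

/* Copy len bytes starting at packet offset into dst, crossing segment
 * boundaries as needed. Returns 0 on success, -1 if the range is not
 * fully inside the packet (dst may then be partially written). */
static int pkt_copy_to_mem(const seg_t *seg, uint32_t offset, uint32_t len,
			   void *dst)
{
	uint8_t *out = dst;

	/* Skip segments that end before the offset (also skips empty ones). */
	while (seg != NULL && offset >= seg->len) {
		offset -= seg->len;
		seg = seg->next;
	}

	while (len > 0 && seg != NULL) {
		uint32_t chunk = seg->len - offset;

		if (chunk > len)
			chunk = len;

		if (chunk > 0)
			memcpy(out, seg->data + offset, chunk);

		out += chunk;
		len -= chunk;
		offset = 0;
		seg = seg->next;
	}

	return len == 0 ? 0 : -1;
}
```

With a 3 byte first segment and a 5 byte second segment, the packet length is 8 but only 3 bytes are addressable from the first data pointer; copying 4 bytes from offset 2 has to take one byte from the first segment and three from the second.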
Re: [lng-odp] [PATCH API-NEXT v1] api: packet: data and segment length
Petri Savolainen(psavol) replied on github web page: include/odp/api/spec/packet.h line 9 @@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt); /** * Packet data pointer * - * Returns the current packet data pointer. When a packet is received - * from packet input, this points to the first byte of the received - * packet. Packet level offsets are calculated relative to this position. + * Returns pointer to the first byte of packet data. When packet is segmented, + * only a portion of packet data follows the pointer. When unsure, use e.g. + * odp_packet_seg_len() to check the data length following the pointer. Packet Comment: packet_offset is for different purpose (access data on arbitrary offset), these calls are optimized for the common case (offset zero). Also odp_packet_seg_data_len(), the new data_seg_len(), odp_packet_l2_ptr(), odp_packet_l3_ptr() and odp_packet_l4_ptr() output seg len, but we don't need to list all possible ways to get it. It's enough that the reader understands that a packet may have segments and segment length is different thing than total packet length > Dmitry Eremin-Solenikov(lumag) wrote: > And here too, please. >> Dmitry Eremin-Solenikov(lumag) wrote: >> odp_packet_seg_len() **or odp_packet_offset()**, if you don't mind. https://github.com/Linaro/odp/pull/497#discussion_r170182482 updated_at 2018-02-23 07:33:19
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page:

platform/linux-generic/odp_queue.c
line 103
@@ -192,6 +201,9 @@ static odp_queue_t queue_create(const char *name,
 		param = _param;
 	}
 
+	if (param->size > CONFIG_QUEUE_SIZE)

Comment:
Agreed, this is good for now. Later we may wish to honor the user-requested queue `size` parameter.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Agreed.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Correct, as noted earlier. I withdraw that comment.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Presumably the compiler will see the overlap and optimize away the
>>> redundancy, so I assume the performance impact will be nil here.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Since you're allowing `head` and `tail` to run over the entire 32-bit range, you're correct that you can completely fill the ring. I did miss that point. However, as shown above you still need this to be:
```
num = size - abs(tail - head);
```
To avoid problems at the 32-bit wrap boundary.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> You're computing in 32 bits not 8 bits, and your ring size is less than
> 2^32 elements. Consider the following test program:
> ```
> #include <stdlib.h>
> #include <stdio.h>
> #include <stdint.h>
> #include <inttypes.h>
>
> void main() {
> 	uint32_t head[4] = {0, 1, 2, 3};
> 	uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd};
> 	uint32_t mask = 4095;
> 	uint32_t result;
> 	int i;
>
> 	for (i = 0; i < 4; i++) {
> 		printf("head = %" PRIu32 " tail = %" PRIu32 ":\n",
> 		       head[i], tail[i]);
>
> 		result = tail[i] - head[i];
> 		printf("tail - head = %" PRIu32 "\n", result);
>
> 		result = (tail[i] - head[i]) & mask;
> 		printf("(tail - head) & mask = %" PRIu32 "\n", result);
>
> 		result = abs(tail[i] - head[i]);
> 		printf("abs(tail - head) = %" PRIu32 "\n\n", result);
> 	}
> }
> ```
> in theory `tail - head` should be the number of elements in the ring, in
> this case 0, 2, 4, and 6.
But running this test program gives the > following output: > ``` > head = 0 tail = 0: > tail - head = 0 > (tail - head) & mask = 0 > abs(tail - head) = 0 > > head = 1 tail = 4294967295: > tail - head = 4294967294 > (tail - head) & mask = 4094 > abs(tail - head) = 2 > > head = 2 tail = 4294967294: > tail - head = 4294967292 > (tail - head) & mask = 4092 > abs(tail - head) = 4 > > head = 3 tail = 4294967293: > tail - head = 4294967290 > (tail - head) & mask = 4090 > abs(tail - head) = 6 > ``` > Since you're allowing head to run free over the 32-bit range of the > variable, when the 32-bits rolls over you'll get a large positive number, > not the small one you need to stay within the ring bounds. The > alternative is to mask `head` and `tail` as you increment them, but then > you run into the effective range issue. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> OK >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> OK, the `ODP_ASSERT()` would still be useful for debugging. Bill Fischofer(Bill-Fischofer-Linaro) wrote: That functionality is not obvious from the name. It either implies that one of the input arguments is written (not true here) or the reader might assume that it is an expression without side-effect and should be deleted (what I originally thought when reading it). You should pick a routine name that makes it clear it's actually doing something real, in this case performing prefetch processing. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Elsewhere you write `if (ring_st_is_empty(...))`, not `if > (ring_st_is_empty(...) == 1)` so this is inconsistent. >> Petri Savolainen(psavol) wrote: >> Didn't try larger than 32. 32 is already quite large from QoS point >> of view. >> >> I'm planning to use config file for run time tunning, so this hard >> coding may change in that phase. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. Petri Savolainen(psavol) wrote: One entry is not lost. User provided size if not (currently) used. 
Queue size is always 4k. > Petri Savolainen(psavol) wrote: > One entry is not lost. >> Petri Savolainen(psavol) wrote: >> One entry is not lost. >>> Petri Savolainen(psavol) wrote: >>> OK, added checks in v2. Petri Savolainen(psavol) wrote: OK. Compiler probably did that already, but changed
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c line 95 @@ -143,8 +150,10 @@ static int queue_capability(odp_queue_capability_t *capa) capa->max_sched_groups = sched_fn->num_grps(); capa->sched_prios = odp_schedule_num_prio(); capa->plain.max_num = capa->max_queues; + capa->plain.max_size= CONFIG_QUEUE_SIZE; capa->plain.nonblocking = ODP_BLOCKING; capa->sched.max_num = capa->max_queues; + capa->sched.max_size= CONFIG_QUEUE_SIZE; Comment: Agreed. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Correct, as noted earlier. I withdraw that comment. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Presumably the compiler will see the overlap and optimize away the >> redundancy, so I assume the performance impact will be nil here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Since you're allowing `head` and `tail` to run over the entire 32-bit >>> range, you're correct that you can completely fill the ring. I did miss >>> that point. However, as shown above you still need this to be: >>> >>> ``` >>> num = size - abs(tail - head); >>> ``` >>> To avoid problems at the 32-bit wrap boundary. Bill Fischofer(Bill-Fischofer-Linaro) wrote: You're computing in 32 bits not 8 bits, and your ring size is less than 2^32 elements. Consider the following test program: ``` #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <inttypes.h> void main() { uint32_t head[4] = {0, 1, 2, 3}; uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd}; uint32_t mask = 4095; uint32_t result; int i; for (i = 0; i < 4; i++) { printf("head = %" PRIu32 " tail = %" PRIu32 ":\n", head[i], tail[i]); result = tail[i] - head[i]; printf("tail - head = %" PRIu32 "\n", result); result = (tail[i] - head[i]) & mask; printf("(tail - head) & mask = %" PRIu32 "\n", result); result = abs(tail[i] - head[i]); printf("abs(tail - head) = %" PRIu32 "\n\n", result); } } ``` in theory `tail - head` should be the number of elements in the ring, in this case 0, 2, 4, and 6.
But running this test program gives the following output: ``` head = 0 tail = 0: tail - head = 0 (tail - head) & mask = 0 abs(tail - head) = 0 head = 1 tail = 4294967295: tail - head = 4294967294 (tail - head) & mask = 4094 abs(tail - head) = 2 head = 2 tail = 4294967294: tail - head = 4294967292 (tail - head) & mask = 4092 abs(tail - head) = 4 head = 3 tail = 4294967293: tail - head = 4294967290 (tail - head) & mask = 4090 abs(tail - head) = 6 ``` Since you're allowing head to run free over the 32-bit range of the variable, when the 32-bits rolls over you'll get a large positive number, not the small one you need to stay within the ring bounds. The alternative is to mask `head` and `tail` as you increment them, but then you run into the effective range issue. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > OK >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> OK, the `ODP_ASSERT()` would still be useful for debugging. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> That functionality is not obvious from the name. It either implies that >>> one of the input arguments is written (not true here) or the reader >>> might assume that it is an expression without side-effect and should be >>> deleted (what I originally thought when reading it). You should pick a >>> routine name that makes it clear it's actually doing something real, in >>> this case performing prefetch processing. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Elsewhere you write `if (ring_st_is_empty(...))`, not `if (ring_st_is_empty(...) == 1)` so this is inconsistent. > Petri Savolainen(psavol) wrote: > Didn't try larger than 32. 32 is already quite large from QoS point > of view. > > I'm planning to use config file for run time tuning, so this hard > coding may change in that phase. >> Petri Savolainen(psavol) wrote: >> One entry is not lost. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. User provided size is not (currently) used. >>> Queue size is always 4k.
Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > One entry is not lost. >> Petri Savolainen(psavol) wrote: >> OK, added checks in v2. >>> Petri Savolainen(psavol) wrote: >>> OK.
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c line 92 @@ -143,8 +150,10 @@ static int queue_capability(odp_queue_capability_t *capa) capa->max_sched_groups = sched_fn->num_grps(); capa->sched_prios = odp_schedule_num_prio(); capa->plain.max_num = capa->max_queues; + capa->plain.max_size= CONFIG_QUEUE_SIZE; Comment: Correct, as noted earlier. I withdraw that comment. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Presumably the compiler will see the overlap and optimize away the > redundancy, so I assume the performance impact will be nil here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Since you're allowing `head` and `tail` to run over the entire 32-bit range, >> you're correct that you can completely fill the ring. I did miss that point. >> However, as shown above you still need this to be: >> >> ``` >> num = size - abs(tail - head); >> ``` >> To avoid problems at the 32-bit wrap boundary. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> You're computing in 32 bits not 8 bits, and your ring size is less than >>> 2^32 elements. Consider the following test program: >>> ``` >>> #include <stdio.h> >>> #include <stdlib.h> >>> #include <stdint.h> >>> #include <inttypes.h> >>> >>> void main() { >>> uint32_t head[4] = {0, 1, 2, 3}; >>> uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd}; >>> uint32_t mask = 4095; >>> uint32_t result; >>> int i; >>> >>> for (i = 0; i < 4; i++) { >>> printf("head = %" PRIu32 " tail = %" PRIu32 ":\n", >>> head[i], tail[i]); >>> >>> result = tail[i] - head[i]; >>> printf("tail - head = %" PRIu32 "\n", result); >>> >>> result = (tail[i] - head[i]) & mask; >>> printf("(tail - head) & mask = %" PRIu32 "\n", result); >>> >>> result = abs(tail[i] - head[i]); >>> printf("abs(tail - head) = %" PRIu32 "\n\n", result); >>> } >>> } >>> ``` >>> in theory `tail - head` should be the number of elements in the ring, in >>> this case 0, 2, 4, and 6.
But running this test program gives the following >>> output: >>> ``` >>> head = 0 tail = 0: >>> tail - head = 0 >>> (tail - head) & mask = 0 >>> abs(tail - head) = 0 >>> >>> head = 1 tail = 4294967295: >>> tail - head = 4294967294 >>> (tail - head) & mask = 4094 >>> abs(tail - head) = 2 >>> >>> head = 2 tail = 4294967294: >>> tail - head = 4294967292 >>> (tail - head) & mask = 4092 >>> abs(tail - head) = 4 >>> >>> head = 3 tail = 4294967293: >>> tail - head = 4294967290 >>> (tail - head) & mask = 4090 >>> abs(tail - head) = 6 >>> ``` >>> Since you're allowing head to run free over the 32-bit range of the >>> variable, when the 32-bits rolls over you'll get a large positive number, >>> not the small one you need to stay within the ring bounds. The alternative >>> is to mask `head` and `tail` as you increment them, but then you run into >>> the effective range issue. Bill Fischofer(Bill-Fischofer-Linaro) wrote: OK > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > OK, the `ODP_ASSERT()` would still be useful for debugging. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> That functionality is not obvious from the name. It either implies that >> one of the input arguments is written (not true here) or the reader >> might assume that it is an expression without side-effect and should be >> deleted (what I originally thought when reading it). You should pick a >> routine name that makes it clear it's actually doing something real, in >> this case performing prefetch processing. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Elsewhere you write `if (ring_st_is_empty(...))`, not `if >>> (ring_st_is_empty(...) == 1)` so this is inconsistent. Petri Savolainen(psavol) wrote: Didn't try larger than 32. 32 is already quite large from QoS point of view. I'm planning to use config file for run time tuning, so this hard coding may change in that phase. > Petri Savolainen(psavol) wrote: > One entry is not lost. >> Petri Savolainen(psavol) wrote: >> One entry is not lost.
User provided size is not (currently) used. >> Queue size is always 4k. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > OK, added checks in v2. >> Petri Savolainen(psavol) wrote: >> OK. Compiler probably did that already, but changed in v2. >>> Petri Savolainen(psavol) wrote: >>> Tail and head indexes are (masked from) uint32_t and do not >>> wrap around when the ring is full. I think you assume that the >>> store index is
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_pool.c line 28 @@ -296,7 +282,9 @@ static void init_buffers(pool_t *pool) memset(buf_hdr, 0, (uintptr_t)data - (uintptr_t)buf_hdr); /* Initialize buffer metadata */ - buf_hdr->index = i; + buf_hdr->index.u32= 0; + buf_hdr->index.pool = pool->pool_idx; + buf_hdr->index.buffer = i; Comment: Presumably the compiler will see the overlap and optimize away the redundancy, so I assume the performance impact will be nil here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Since you're allowing `head` and `tail` to run over the entire 32-bit range, > you're correct that you can completely fill the ring. I did miss that point. > However, as shown above you still need this to be: > > ``` > num = size - abs(tail - head); > ``` > To avoid problems at the 32-bit wrap boundary. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> You're computing in 32 bits not 8 bits, and your ring size is less than 2^32 >> elements. Consider the following test program: >> ``` >> #include <stdio.h> >> #include <stdlib.h> >> #include <stdint.h> >> #include <inttypes.h> >> >> void main() { >> uint32_t head[4] = {0, 1, 2, 3}; >> uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd}; >> uint32_t mask = 4095; >> uint32_t result; >> int i; >> >> for (i = 0; i < 4; i++) { >> printf("head = %" PRIu32 " tail = %" PRIu32 ":\n", >> head[i], tail[i]); >> >> result = tail[i] - head[i]; >> printf("tail - head = %" PRIu32 "\n", result); >> >> result = (tail[i] - head[i]) & mask; >> printf("(tail - head) & mask = %" PRIu32 "\n", result); >> >> result = abs(tail[i] - head[i]); >> printf("abs(tail - head) = %" PRIu32 "\n\n", result); >> } >> } >> ``` >> in theory `tail - head` should be the number of elements in the ring, in >> this case 0, 2, 4, and 6.
But running this test program gives the following >> output: >> ``` >> head = 0 tail = 0: >> tail - head = 0 >> (tail - head) & mask = 0 >> abs(tail - head) = 0 >> >> head = 1 tail = 4294967295: >> tail - head = 4294967294 >> (tail - head) & mask = 4094 >> abs(tail - head) = 2 >> >> head = 2 tail = 4294967294: >> tail - head = 4294967292 >> (tail - head) & mask = 4092 >> abs(tail - head) = 4 >> >> head = 3 tail = 4294967293: >> tail - head = 4294967290 >> (tail - head) & mask = 4090 >> abs(tail - head) = 6 >> ``` >> Since you're allowing head to run free over the 32-bit range of the >> variable, when the 32-bits rolls over you'll get a large positive number, >> not the small one you need to stay within the ring bounds. The alternative >> is to mask `head` and `tail` as you increment them, but then you run into >> the effective range issue. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> OK Bill Fischofer(Bill-Fischofer-Linaro) wrote: OK, the `ODP_ASSERT()` would still be useful for debugging. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > That functionality is not obvious from the name. It either implies that > one of the input arguments is written (not true here) or the reader might > assume that it is an expression without side-effect and should be deleted > (what I originally thought when reading it). You should pick a routine > name that makes it clear it's actually doing something real, in this case > performing prefetch processing. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Elsewhere you write `if (ring_st_is_empty(...))`, not `if >> (ring_st_is_empty(...) == 1)` so this is inconsistent. >>> Petri Savolainen(psavol) wrote: >>> Didn't try larger than 32. 32 is already quite large from QoS point of >>> view. >>> >>> I'm planning to use config file for run time tuning, so this hard >>> coding may change in that phase. Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > One entry is not lost.
User provided size is not (currently) used. > Queue size is always 4k. >> Petri Savolainen(psavol) wrote: >> One entry is not lost. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. Petri Savolainen(psavol) wrote: OK, added checks in v2. > Petri Savolainen(psavol) wrote: > OK. Compiler probably did that already, but changed in v2. >> Petri Savolainen(psavol) wrote: >> Tail and head indexes are (masked from) uint32_t and do not wrap >> around when the ring is full. I think you assume that the store >> index is 0...size-1, while it's full uint32_t which is then >> masked to get the actual index. >> >> For example:
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c line 420 @@ -584,8 +556,9 @@ static int queue_init(queue_entry_t *queue, const char *name, queue->s.pktin = PKTIN_INVALID; queue->s.pktout = PKTOUT_INVALID; - queue->s.head = NULL; - queue->s.tail = NULL; + ring_st_init(&queue->s.ring_st, +queue_tbl->ring_data[queue->s.index].data, +CONFIG_QUEUE_SIZE); Comment: Agreed. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Agreed, this is good for now. Later we may wish to honor the user-requested > queue `size` parameter. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Agreed. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Correct, as noted earlier. I withdraw that comment. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Presumably the compiler will see the overlap and optimize away the redundancy, so I assume the performance impact will be nil here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Since you're allowing `head` and `tail` to run over the entire 32-bit > range, you're correct that you can completely fill the ring. I did miss > that point. However, as shown above you still need this to be: > > ``` > num = size - abs(tail - head); > ``` > To avoid problems at the 32-bit wrap boundary. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> You're computing in 32 bits not 8 bits, and your ring size is less than >> 2^32 elements.
Consider the following test program: >> ``` >> #include <stdio.h> >> #include <stdlib.h> >> #include <stdint.h> >> #include <inttypes.h> >> >> void main() { >> uint32_t head[4] = {0, 1, 2, 3}; >> uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd}; >> uint32_t mask = 4095; >> uint32_t result; >> int i; >> >> for (i = 0; i < 4; i++) { >> printf("head = %" PRIu32 " tail = %" PRIu32 ":\n", >> head[i], tail[i]); >> >> result = tail[i] - head[i]; >> printf("tail - head = %" PRIu32 "\n", result); >> >> result = (tail[i] - head[i]) & mask; >> printf("(tail - head) & mask = %" PRIu32 "\n", result); >> >> result = abs(tail[i] - head[i]); >> printf("abs(tail - head) = %" PRIu32 "\n\n", result); >> } >> } >> ``` >> in theory `tail - head` should be the number of elements in the ring, in >> this case 0, 2, 4, and 6. But running this test program gives the >> following output: >> ``` >> head = 0 tail = 0: >> tail - head = 0 >> (tail - head) & mask = 0 >> abs(tail - head) = 0 >> >> head = 1 tail = 4294967295: >> tail - head = 4294967294 >> (tail - head) & mask = 4094 >> abs(tail - head) = 2 >> >> head = 2 tail = 4294967294: >> tail - head = 4294967292 >> (tail - head) & mask = 4092 >> abs(tail - head) = 4 >> >> head = 3 tail = 4294967293: >> tail - head = 4294967290 >> (tail - head) & mask = 4090 >> abs(tail - head) = 6 >> ``` >> Since you're allowing head to run free over the 32-bit range of the >> variable, when the 32-bits rolls over you'll get a large positive >> number, not the small one you need to stay within the ring bounds. The >> alternative is to mask `head` and `tail` as you increment them, but then >> you run into the effective range issue. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> OK Bill Fischofer(Bill-Fischofer-Linaro) wrote: OK, the `ODP_ASSERT()` would still be useful for debugging. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > That functionality is not obvious from the name.
It either implies > that one of the input arguments is written (not true here) or the > reader might assume that it is an expression without side-effect and > should be deleted (what I originally thought when reading it). You > should pick a routine name that makes it clear it's actually doing > something real, in this case performing prefetch processing. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Elsewhere you write `if (ring_st_is_empty(...))`, not `if >> (ring_st_is_empty(...) == 1)` so this is inconsistent. >>> Petri Savolainen(psavol) wrote: >>> Didn't try larger than 32. 32 is already quite large from QoS point >>> of view. >>> >>> I'm planning to use config file for run time tuning, so this hard >>> coding may change in that phase. Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > One entry is not lost. User provided size is not (currently) > used. Queue size is always 4k. >> Petri
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 78 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; + + /* Empty */ + if (num == 0) + return 0; + + if (num > max_num) + num = max_num; + + idx = head & mask; + + for (i = 0; i < num; i++) { + data[i] = ring->data[idx]; + idx = (idx + 1) & mask; + } + + ring->head = head + num; + + return num; +} + +/* Enqueue data into the ring tail. Num_data is smaller than ring size. */ +static inline uint32_t ring_st_enq_multi(ring_st_t *ring, const uint32_t data[], +uint32_t num_data) +{ + uint32_t head, tail, mask, size, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + size = mask + 1; + num = size - (tail - head); Comment: Since you're allowing `head` and `tail` to run over the entire 32-bit range, you're correct that you can completely fill the ring. I did miss that point. 
However, as shown above you still need this to be: ``` num = size - abs(tail - head); ``` To avoid problems at the 32-bit wrap boundary. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > You're computing in 32 bits not 8 bits, and your ring size is less than 2^32 > elements. Consider the following test program: > ``` > #include <stdio.h> > #include <stdlib.h> > #include <stdint.h> > #include <inttypes.h> > > void main() { > uint32_t head[4] = {0, 1, 2, 3}; > uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd}; > uint32_t mask = 4095; > uint32_t result; > int i; > > for (i = 0; i < 4; i++) { > printf("head = %" PRIu32 " tail = %" PRIu32 ":\n", > head[i], tail[i]); > > result = tail[i] - head[i]; > printf("tail - head = %" PRIu32 "\n", result); > > result = (tail[i] - head[i]) & mask; > printf("(tail - head) & mask = %" PRIu32 "\n", result); > > result = abs(tail[i] - head[i]); > printf("abs(tail - head) = %" PRIu32 "\n\n", result); > } > } > ``` > in theory `tail - head` should be the number of elements in the ring, in this > case 0, 2, 4, and 6. But running this test program gives the following output: > ``` > head = 0 tail = 0: > tail - head = 0 > (tail - head) & mask = 0 > abs(tail - head) = 0 > > head = 1 tail = 4294967295: > tail - head = 4294967294 > (tail - head) & mask = 4094 > abs(tail - head) = 2 > > head = 2 tail = 4294967294: > tail - head = 4294967292 > (tail - head) & mask = 4092 > abs(tail - head) = 4 > > head = 3 tail = 4294967293: > tail - head = 4294967290 > (tail - head) & mask = 4090 > abs(tail - head) = 6 > ``` > Since you're allowing head to run free over the 32-bit range of the variable, > when the 32-bits rolls over you'll get a large positive number, not the small > one you need to stay within the ring bounds. The alternative is to mask > `head` and `tail` as you increment them, but then you run into the effective > range issue. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> OK >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> OK, the `ODP_ASSERT()` would still be useful for debugging.
Bill Fischofer(Bill-Fischofer-Linaro) wrote: That functionality is not obvious from the name. It either implies that one of the input arguments is written (not true here) or the reader might assume that it is an expression without side-effect and should be deleted (what I originally thought when reading it). You should pick a routine name that makes it clear it's actually doing something real, in this case performing prefetch processing. > Bill
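The free-space computation `num = size - (tail - head)` discussed in this message can be tried in isolation. The following is a minimal sketch, not the code from the patch: the `demo_` names and the `start` parameter are hypothetical and exist only so that the counters can begin just below 2^32. It assumes `head` and `tail` are free-running `uint32_t` counters that are masked only when used as an array index, as described in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ring_st_t: head and tail are free-running
 * 32-bit counters, masked only when used as an array index. */
typedef struct {
	uint32_t head;
	uint32_t tail;
	uint32_t mask;
	uint32_t *data;
} demo_ring_t;

/* Size must be a power of two. 'start' is a demo-only parameter that lets
 * both counters begin close to UINT32_MAX. */
static void demo_ring_init(demo_ring_t *ring, uint32_t *data, uint32_t size,
			   uint32_t start)
{
	ring->head = start;
	ring->tail = start;
	ring->mask = size - 1;
	ring->data = data;
}

/* Enqueue up to num_data values; returns how many were stored. */
static uint32_t demo_ring_enq(demo_ring_t *ring, const uint32_t data[],
			      uint32_t num_data)
{
	uint32_t size = ring->mask + 1;
	uint32_t num = size - (ring->tail - ring->head); /* free slots */
	uint32_t i;

	if (num > num_data)
		num = num_data;

	for (i = 0; i < num; i++)
		ring->data[(ring->tail + i) & ring->mask] = data[i];

	ring->tail += num;
	return num;
}

/* Dequeue up to max_num values; returns how many were read. */
static uint32_t demo_ring_deq(demo_ring_t *ring, uint32_t data[],
			      uint32_t max_num)
{
	uint32_t num = ring->tail - ring->head; /* stored elements */
	uint32_t i;

	if (num > max_num)
		num = max_num;

	for (i = 0; i < num; i++)
		data[i] = ring->data[(ring->head + i) & ring->mask];

	ring->head += num;
	return num;
}

/* Fill a 4-entry ring completely while the counters cross 2^32. */
static void demo_ring_check(void)
{
	uint32_t store[4], out[4];
	const uint32_t in[4] = {10, 11, 12, 13};
	demo_ring_t ring;

	demo_ring_init(&ring, store, 4, UINT32_MAX - 1);
	assert(demo_ring_enq(&ring, in, 4) == 4);  /* all four slots used */
	assert(demo_ring_enq(&ring, in, 1) == 0);  /* ring is full */
	assert(demo_ring_deq(&ring, out, 3) == 3);
	assert(out[0] == 10 && out[2] == 12);
	assert(ring.tail - ring.head == 1);        /* one element left */
}
```

Calling `demo_ring_check()` exercises the case where `tail` has wrapped past zero while `head` has not. Under the stated assumption the unsigned subtraction still yields the element count, so in this sketch no slot has to be kept empty to tell a full ring from an empty one.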
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 46 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; Comment: You're computing in 32 bits not 8 bits, and your ring size is less than 2^32 elements. Consider the following test program: ``` #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <inttypes.h> void main() { uint32_t head[4] = {0, 1, 2, 3}; uint32_t tail[4] = {0, 0xffffffff, 0xfffffffe, 0xfffffffd}; uint32_t mask = 4095; uint32_t result; int i; for (i = 0; i < 4; i++) { printf("head = %" PRIu32 " tail = %" PRIu32 ":\n", head[i], tail[i]); result = tail[i] - head[i]; printf("tail - head = %" PRIu32 "\n", result); result = (tail[i] - head[i]) & mask; printf("(tail - head) & mask = %" PRIu32 "\n", result); result = abs(tail[i] - head[i]); printf("abs(tail - head) = %" PRIu32 "\n\n", result); } } ``` in theory `tail - head` should be the number of elements in the ring, in this case 0, 2, 4, and 6.
But running this test program gives the following output: ``` head = 0 tail = 0: tail - head = 0 (tail - head) & mask = 0 abs(tail - head) = 0 head = 1 tail = 4294967295: tail - head = 4294967294 (tail - head) & mask = 4094 abs(tail - head) = 2 head = 2 tail = 4294967294: tail - head = 4294967292 (tail - head) & mask = 4092 abs(tail - head) = 4 head = 3 tail = 4294967293: tail - head = 4294967290 (tail - head) & mask = 4090 abs(tail - head) = 6 ``` Since you're allowing head to run free over the 32-bit range of the variable, when the 32-bits rolls over you'll get a large positive number, not the small one you need to stay within the ring bounds. The alternative is to mask `head` and `tail` as you increment them, but then you run into the effective range issue. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > OK >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> OK, the `ODP_ASSERT()` would still be useful for debugging. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> That functionality is not obvious from the name. It either implies that one >>> of the input arguments is written (not true here) or the reader might >>> assume that it is an expression without side-effect and should be deleted >>> (what I originally thought when reading it). You should pick a routine name >>> that makes it clear it's actually doing something real, in this case >>> performing prefetch processing. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Elsewhere you write `if (ring_st_is_empty(...))`, not `if (ring_st_is_empty(...) == 1)` so this is inconsistent. > Petri Savolainen(psavol) wrote: > Didn't try larger than 32. 32 is already quite large from QoS point of > view. > > I'm planning to use config file for run time tuning, so this hard coding > may change in that phase. >> Petri Savolainen(psavol) wrote: >> One entry is not lost. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. User provided size is not (currently) used. >>> Queue size is always 4k.
Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > One entry is not lost. >> Petri Savolainen(psavol) wrote: >> OK, added checks in v2. >>> Petri Savolainen(psavol) wrote: >>> OK. Compiler probably did that already, but changed in v2. Petri Savolainen(psavol) wrote: Tail and head indexes are (masked from) uint32_t and do not wrap around when the ring is full. I think you assume that the store index is 0...size-1, while it's full uint32_t
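The message above is cut off in this archive, so what follows is not the author's own example. It is a minimal sketch of the index scheme the message describes, with hypothetical helper names: the counters are kept as full `uint32_t` values, the difference `tail - head` is taken modulo 2^32, and the mask is applied only to obtain a storage slot. The sketch assumes `tail` is never more than one ring size ahead of `head`.

```c
#include <stdint.h>

/* Stored element count for free-running counters (hypothetical helper).
 * Unsigned subtraction is defined modulo 2^32, so the result is the
 * distance from head forward to tail even when tail has wrapped. */
static inline uint32_t demo_used(uint32_t head, uint32_t tail)
{
	return tail - head;
}

/* Storage slot for a free-running counter; mask + 1 is the ring size and
 * must be a power of two. */
static inline uint32_t demo_slot(uint32_t counter, uint32_t mask)
{
	return counter & mask;
}
```

With `head = 0xffffffff` and `tail = 1`, `demo_used()` returns 2. The reverse pairing, `head = 1` and `tail = 0xffffffff`, gives 4294967294, which exceeds any ring size below 2^32 and so cannot arise under the assumption stated above.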
Re: [lng-odp] [PATCH API-NEXT v2] IPsec TFC implementation
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: include/odp/api/spec/ipsec.h line 8 @@ -1346,9 +1346,7 @@ int odp_ipsec_in(const odp_packet_t pkt_in[], int num_in, * and content of packet data before the IP header is undefined. Use outbound * operation parameters to specify the amount of TFC padding appended to * the packet during IPSEC transformation. Options can be used also to create - * TFC dummy packets. Packet data content is ignored in tunnel mode TFC dummy - * packet creation as tfc_pad_len option defines solely the packet length. - * In all other cases, payload length for the IPSEC transformation is specified + * TFC dummy packets. Payload length for the IPSEC transformation is specified * by odp_packet_len() minus odp_packet_l3_offset() plus tfc_pad_len option. Comment: A dummy packet is defined only by the fact that it is a dummy and the requested length. No further metadata should be needed or required, so I'd be happy to see that requirement dropped. I view this as a building block for a TBD higher-level configuration service that would simply permit IPsec to be configured to provide a requested level of traffic masking and leave it up to the implementation to decide how best and most efficiently to achieve that via a combination of TFC dummy packets and padding. > Dmitry Eremin-Solenikov(lumag) wrote: > Well, it's just a magic number. Same as 0 would be. >> Dmitry Eremin-Solenikov(lumag) wrote: >> Hmm. We've spent several minutes on this, but nobody reminded of API >> convention. Should we change the spec here? @psavol what is your opinion? >>> Dmitry Eremin-Solenikov(lumag) wrote: >>> We require that an application sets `l3_offset` for all packet it pushes to >>> IPsec. For TFC dummy packets it resulted in `l3_offset` being set but >>> ignored. Thus I proposed this change. Other solution might be to stop >>> requiring `l3_offset` for TFC dummy packets. 
Bill Fischofer(Bill-Fischofer-Linaro) wrote: Should `0xa5` be a `#define` rather than a "magic number"? > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Nit: could use `odp_unlikely()` here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This change requires an API change as the spec says relevant offsets >> must be in the range `0..odp_packet_len(pkt) - 1` . Same comment for >> the L3 and L4 changes in this patch. >> >> In theory the validation tests should test these bounds, but as with >> most parts of the API violations simply result in undefined behavior, so >> this is an "honor system". Still, we can't violate the spec here without >> changing the spec. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Unless it has been parsed, `odp_packet_l3_offset()` is initialized to >>> `ODP_PACKET_OFFSET_INVALID`, so this seems an undue burden. The >>> original wording seems cleaner from an application perspective. https://github.com/Linaro/odp/pull/494#discussion_r170133591 updated_at 2018-02-23 00:13:09
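The payload length rule quoted from the spec (`odp_packet_len()` minus `odp_packet_l3_offset()` plus the `tfc_pad_len` option) can be written out as plain arithmetic. This is a minimal sketch with hypothetical names that does not call the ODP API. `DEMO_OFFSET_INVALID` stands in for `ODP_PACKET_OFFSET_INVALID`, whose actual value is not assumed here, and the range check follows the `0..odp_packet_len(pkt) - 1` range mentioned in the review.

```c
#include <stdint.h>

/* Demo-only marker for an unset offset; stands in for
 * ODP_PACKET_OFFSET_INVALID without assuming its real value. */
#define DEMO_OFFSET_INVALID UINT32_MAX

/* Payload length for the IPsec transformation:
 *   packet length - L3 offset + TFC padding length.
 * Returns 0 on success, -1 when the L3 offset is unset or lies outside
 * the range 0..pkt_len-1. */
static int demo_ipsec_payload_len(uint32_t pkt_len, uint32_t l3_offset,
				  uint32_t tfc_pad_len, uint32_t *payload_len)
{
	if (l3_offset == DEMO_OFFSET_INVALID || l3_offset >= pkt_len)
		return -1;

	*payload_len = pkt_len - l3_offset + tfc_pad_len;
	return 0;
}
```

For a 100-byte packet with the L3 header at offset 14 and 32 bytes of TFC padding, the sketch reports a payload length of 118. An unparsed packet whose offset is still the invalid marker is rejected, which is the application burden raised in the last quoted comment.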
Re: [lng-odp] [PATCH API-NEXT v2] IPsec TFC implementation
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_ipsec.c line 245 @@ -1165,6 +1167,8 @@ static int ipsec_out_esp(odp_packet_t *pkt, ipsec_offset + _ODP_ESPHDR_LEN, ipsec_sa->esp_iv_len, state->iv + ipsec_sa->salt_length); + _odp_packet_set_data(*pkt, esptrl_offset - esptrl.pad_len - tfc_len, +0xa5, tfc_len); Comment: True, but non-zero numbers tend to stand out when used like this. Why pick this number vs. some other? Is there something wrong with using zeros? > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > A dummy packet is defined only by the fact that it is a dummy and the > requested length. No further metadata should be needed or required, so I'd be > happy to see that requirement dropped. I view this as a building block for a > TBD higher-level configuration service that would simply permit IPsec to be > configured to provide a requested level of traffic masking and leave it up to > the implementation to decide how best and most efficiently to achieve that > via a combination of TFC dummy packets and padding. >> Dmitry Eremin-Solenikov(lumag) wrote: >> Well, it's just a magic number. Same as 0 would be. >>> Dmitry Eremin-Solenikov(lumag) wrote: >>> Hmm. We've spent several minutes on this, but nobody reminded of API >>> convention. Should we change the spec here? @psavol what is your opinion? Dmitry Eremin-Solenikov(lumag) wrote: We require that an application sets `l3_offset` for all packet it pushes to IPsec. For TFC dummy packets it resulted in `l3_offset` being set but ignored. Thus I proposed this change. Other solution might be to stop requiring `l3_offset` for TFC dummy packets. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Should `0xa5` be a `#define` rather than a "magic number"? >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Nit: could use `odp_unlikely()` here. 
>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This change requires an API change as the spec says relevant offsets >>> must be in the range `0..odp_packet_len(pkt) - 1` . Same comment for >>> the L3 and L4 changes in this patch. >>> >>> In theory the validation tests should test these bounds, but as with >>> most parts of the API violations simply result in undefined behavior, >>> so this is an "honor system". Still, we can't violate the spec here >>> without changing the spec. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Unless it has been parsed, `odp_packet_l3_offset()` is initialized to `ODP_PACKET_OFFSET_INVALID`, so this seems an undue burden. The original wording seems cleaner from an application perspective. https://github.com/Linaro/odp/pull/494#discussion_r170133989 updated_at 2018-02-23 00:16:05
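The question of whether `0xa5` should be a `#define` can be illustrated on a plain byte buffer. This is a minimal sketch, not the ODP implementation: `DEMO_TFC_PAD_BYTE` and `demo_fill_tfc()` are hypothetical names, and `memset()` on a flat buffer stands in for `_odp_packet_set_data()`. The start offset `esptrl_offset - pad_len - tfc_len` mirrors the expression in the quoted diff.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name for the fill value; the patch under review passes the
 * literal 0xa5. */
#define DEMO_TFC_PAD_BYTE 0xa5

/* Fill tfc_len bytes that end where the ESP trailer padding begins.
 * Returns 0 on success, -1 if the requested range does not fit in buf. */
static int demo_fill_tfc(uint8_t *buf, uint32_t buf_len,
			 uint32_t esptrl_offset, uint32_t pad_len,
			 uint32_t tfc_len)
{
	if (esptrl_offset > buf_len || pad_len > esptrl_offset ||
	    tfc_len > esptrl_offset - pad_len)
		return -1;

	memset(buf + esptrl_offset - pad_len - tfc_len, DEMO_TFC_PAD_BYTE,
	       tfc_len);
	return 0;
}
```

Naming the constant does not settle whether zero or another value is the better fill byte, which the thread leaves open. It only gives the choice a single place to be documented or changed.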
Re: [lng-odp] [PATCH API-NEXT v1] api: packet: data and segment length
Dmitry Eremin-Solenikov(lumag) replied on github web page: include/odp/api/spec/packet.h line 28 @@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt); /** * Packet data pointer * - * Returns the current packet data pointer. When a packet is received - * from packet input, this points to the first byte of the received - * packet. Packet level offsets are calculated relative to this position. + * Returns pointer to the first byte of packet data. When packet is segmented, + * only a portion of packet data follows the pointer. When unsure, use e.g. + * odp_packet_seg_len() to check the data length following the pointer. Packet + * level offsets are calculated relative to this position. * - * User can adjust the data pointer with head_push/head_pull (does not modify - * segmentation) and add_data/rem_data calls (may modify segmentation). + * When a packet is received from packet input, this points to the first byte + * of the received packet. Pool configuration parameters may be used to ensure + * that the first packet segment contains all/most of the data relevant to the + * application. + * + * User can adjust the data pointer with e.g. push_head/pull_head (does not + * modify segmentation) and extend_head/trunc_head (may modify segmentation) + * calls. * * @param pkt Packet handle * * @return Pointer to the packet data * - * @see odp_packet_l2_ptr(), odp_packet_seg_len() + * @see odp_packet_seg_len(), odp_packet_push_head(), odp_packet_extend_head() Comment: And here too, please. > Dmitry Eremin-Solenikov(lumag) wrote: > odp_packet_seg_len() **or odp_packet_offset()**, if you don't mind. https://github.com/Linaro/odp/pull/497#discussion_r170095913 updated_at 2018-02-22 21:18:09
Re: [lng-odp] [PATCH API-NEXT v1] api: packet: data and segment length
Dmitry Eremin-Solenikov(lumag) replied on github web page: include/odp/api/spec/packet.h line 9 @@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt); /** * Packet data pointer * - * Returns the current packet data pointer. When a packet is received - * from packet input, this points to the first byte of the received - * packet. Packet level offsets are calculated relative to this position. + * Returns pointer to the first byte of packet data. When packet is segmented, + * only a portion of packet data follows the pointer. When unsure, use e.g. + * odp_packet_seg_len() to check the data length following the pointer. Packet Comment: odp_packet_seg_len() **or odp_packet_offset()**, if you don't mind. https://github.com/Linaro/odp/pull/497#discussion_r170095823 updated_at 2018-02-22 21:17:40
[lng-odp] [PATCH v5 2/2] validation: pool: verify pool data range
From: Michal Mazur

Allocate maximum number of packets from pool and verify that packet data are located inside range returned by odp_pool_info. Signed-off-by: Michal Mazur --- /** Email created from pull request 495 (semihalf-mazur-michal:master) ** https://github.com/Linaro/odp/pull/495 ** Patch: https://github.com/Linaro/odp/pull/495.patch ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb ** Merge commit sha: 002dda1fe241b7d56834904f064161bc4f7857af **/ test/validation/api/pool/pool.c | 45 + 1 file changed, 45 insertions(+) diff --git a/test/validation/api/pool/pool.c b/test/validation/api/pool/pool.c index 34f973573..3e84cd2d6 100644 --- a/test/validation/api/pool/pool.c +++ b/test/validation/api/pool/pool.c @@ -217,6 +217,50 @@ static void pool_test_info_packet(void) CU_ASSERT(odp_pool_destroy(pool) == 0); } +static void pool_test_info_data_range(void) +{ + odp_pool_t pool; + odp_pool_info_t info; + odp_pool_param_t param; + odp_packet_t pkt[PKT_NUM]; + uint32_t i, num; + uintptr_t pkt_data, pool_len; + + odp_pool_param_init(&param); + + param.type = ODP_POOL_PACKET; + param.pkt.num = PKT_NUM; + param.pkt.len = PKT_LEN; + + pool = odp_pool_create(NULL, &param); + CU_ASSERT_FATAL(pool != ODP_POOL_INVALID); + + CU_ASSERT_FATAL(odp_pool_info(pool, &info) == 0); + + pool_len = info.max_data_addr - info.min_data_addr + 1; + CU_ASSERT(pool_len >= PKT_NUM * PKT_LEN); + + num = 0; + + for (i = 0; i < PKT_NUM; i++) { + pkt[num] = odp_packet_alloc(pool, PKT_LEN); + CU_ASSERT(pkt[num] != ODP_PACKET_INVALID); + + if (pkt[num] != ODP_PACKET_INVALID) + num++; + } + + for (i = 0; i < num; i++) { + pkt_data = (uintptr_t)odp_packet_data(pkt[i]); + CU_ASSERT((pkt_data >= info.min_data_addr) && + (pkt_data + PKT_LEN - 1 <= info.max_data_addr)); + + odp_packet_free(pkt[i]); + } + + CU_ASSERT(odp_pool_destroy(pool) == 0); +} + odp_testinfo_t pool_suite[] = { ODP_TEST_INFO(pool_test_create_destroy_buffer), ODP_TEST_INFO(pool_test_create_destroy_packet), @@ -225,6 +269,7 @@ odp_testinfo_t 
pool_suite[] = { ODP_TEST_INFO(pool_test_alloc_packet_subparam), ODP_TEST_INFO(pool_test_info_packet), ODP_TEST_INFO(pool_test_lookup_info_print), + ODP_TEST_INFO(pool_test_info_data_range), ODP_TEST_INFO_NULL, };
[lng-odp] [PATCH v5 1/2] linux-generic: pool: Return address range in pool info
From: Michal Mazur

Implement support in odp_pool_info function to provide address range of pool data available to application. Pull request of related API change: https://github.com/Linaro/odp/pull/200 Signed-off-by: Michal Mazur --- /** Email created from pull request 495 (semihalf-mazur-michal:master) ** https://github.com/Linaro/odp/pull/495 ** Patch: https://github.com/Linaro/odp/pull/495.patch ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb ** Merge commit sha: 002dda1fe241b7d56834904f064161bc4f7857af **/ platform/linux-generic/odp_pool.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/platform/linux-generic/odp_pool.c b/platform/linux-generic/odp_pool.c index e5ba8982a..03578135c 100644 --- a/platform/linux-generic/odp_pool.c +++ b/platform/linux-generic/odp_pool.c @@ -693,6 +693,9 @@ int odp_pool_info(odp_pool_t pool_hdl, odp_pool_info_t *info) if (pool->params.type == ODP_POOL_PACKET) info->pkt.max_num = pool->num; + info->min_data_addr = (uintptr_t)pool->base_addr; + info->max_data_addr = (uintptr_t)pool->base_addr + pool->shm_size - 1; + return 0; }
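As a rough illustration of how an application might use the reported range, the sketch below checks whether a buffer lies entirely inside the inclusive interval [min_data_addr, max_data_addr]. It does not call ODP: `demo_pool_range_t` and `demo_in_pool_range()` are hypothetical stand-ins, and only the two field names are taken from the patch. The interval is inclusive, which matches the `- 1` in the patch's computation of `max_data_addr`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two address fields the patch fills in. */
typedef struct {
	uintptr_t min_data_addr; /* first valid data address (inclusive) */
	uintptr_t max_data_addr; /* last valid data address (inclusive) */
} demo_pool_range_t;

/* Return 1 when [ptr, ptr + len - 1] lies inside the range, 0 otherwise. */
static inline int demo_in_pool_range(const demo_pool_range_t *range,
				     const void *ptr, size_t len)
{
	uintptr_t start = (uintptr_t)ptr;

	assert(range != NULL);

	if (len == 0 || start < range->min_data_addr ||
	    start > range->max_data_addr)
		return 0;

	/* Compare against the remaining room instead of computing
	 * start + len - 1, which could overflow for a huge len. */
	return (len - 1) <= (range->max_data_addr - start);
}
```

The validation test in this series performs the same comparison directly on `odp_packet_data()` with `PKT_LEN`; the helper form above only makes the inclusive bounds and the overflow-safe ordering explicit.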
[lng-odp] [PATCH v5 0/2] linux-generic: pool: Return address range in pool info
Implement support in odp_pool_info function to provide address range of pool data available to application. Similar change was already merged to caterpillar/linux-dpdk: #400 Pull request of related API change: #200 github /** Email created from pull request 495 (semihalf-mazur-michal:master) ** https://github.com/Linaro/odp/pull/495 ** Patch: https://github.com/Linaro/odp/pull/495.patch ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb ** Merge commit sha: 002dda1fe241b7d56834904f064161bc4f7857af **/ /github checkpatch.pl total: 0 errors, 0 warnings, 0 checks, 9 lines checked to_send-p-000.patch has no obvious style problems and is ready for submission. total: 0 errors, 0 warnings, 0 checks, 57 lines checked to_send-p-001.patch has no obvious style problems and is ready for submission. /checkpatch.pl
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 24 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + Comment: OK > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > OK, the `ODP_ASSERT()` would still be useful for debugging. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> That functionality is not obvious from the name. It either implies that one >> of the input arguments is written (not true here) or the reader might assume >> that it is an expression without side-effect and should be deleted (what I >> originally thought when reading it). You should pick a routine name that >> makes it clear it's actually doing something real, in this case performing >> prefetch processing. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Elsewhere you write `if (ring_st_is_empty(...))`, not `if >>> (ring_st_is_empty(...) == 1)` so this is inconsistent. Petri Savolainen(psavol) wrote: Didn't try larger than 32. 32 is already quite large from QoS point of view. I'm planning to use config file for run time tuning, so this hard coding may change in that phase. > Petri Savolainen(psavol) wrote: > One entry is not lost. >> Petri Savolainen(psavol) wrote: >> One entry is not lost. User provided size is not (currently) used. Queue >> size is always 4k. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > OK, added checks in v2. 
>> Petri Savolainen(psavol) wrote: >> OK. Compiler probably did that already, but changed in v2. >>> Petri Savolainen(psavol) wrote: >>> Tail and head indexes are (masked from) uint32_t and do not wrap >>> around when the ring is full. I think you assume that the store >>> index is 0...size-1, while it's full uint32_t which is then masked >>> to get the actual index. >>> >>> For example: >>> size = 100; >>> >>> Empty: >>> head = 100 >>> tail = 100 >>> num = 100 - 100 = 0 >>> >>> Full: >>> head = 100 >>> tail = 200 >>> num = 200 - 100 = 100 >>> >>> Wrap uint32_t + full: >>> head = 0xFFFFFF9C >>> tail = 0 >>> num = 0 - 0xFFFFFF9C = 0x64 = 100 >>> >>> So, no abs() needed. Ring size can be 4096, instead of 4095. Petri Savolainen(psavol) wrote: It's already documented 5 lines above: /* Initialize ring. Ring size must be a power of two. */ static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) { > Petri Savolainen(psavol) wrote: > This function converts 32 bit buffer indexes to buffer header > pointers. The inverse operation is buffer_index_from_buf(). The > prefetch is a side effect of the function, which may be > changed/moved any time if it's found out that there's a place for > prefetching. I actually plan to test if number of prefetches > should be limited as e.g. 32 consecutive prefetches may be too > much for some CPU architectures. >> Petri Savolainen(psavol) wrote: >> I prefer style where '== 0' is used instead of '!'. Especially, >> when the if clause is as complex as this and there's danger for >> reader to miss the '!' sign. >>> Petri Savolainen(psavol) wrote: >>> It's there to ensure that all bits are zero also when someone >>> would modify the bitfield from two to three fields later on. >>> Similarly to memset() zero is used for struct inits. Petri Savolainen(psavol) wrote: There's no need for abs(). Since it's all uint32_t variables, wraparound is handled already. 
An example in 8 bits:
0xff - 0xfd = 0x02
0x00 - 0xfe = 0x02
0x01 - 0xff = 0x02
0x02 - 0x00 = 0x02
This passes both gcc and clang, and is already used in the other ring implementation; see ring_deq_multi().
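To make the unsigned-arithmetic argument concrete, here is a minimal, self-contained sketch of a single-thread ring that keeps free-running uint32_t head and tail counters and applies the mask only when indexing the storage. It is not the code from the patch: `demo_ring_t`, `demo_ring_init()`, `demo_ring_len()`, `demo_ring_enq()` and `demo_ring_deq()` are hypothetical names, and plain `assert()` stands in for the `ODP_ASSERT()` suggested in the review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-thread ring. head and tail are free-running
 * counters; they are masked only when used as array indexes. */
typedef struct {
	uint32_t head;  /* counter of the next entry to dequeue */
	uint32_t tail;  /* counter of the next entry to enqueue */
	uint32_t mask;  /* size - 1; size must be a power of two */
	uint32_t *data; /* caller-provided storage with 'size' entries */
} demo_ring_t;

static inline void demo_ring_init(demo_ring_t *ring, uint32_t *data,
				  uint32_t size)
{
	/* Masking is only valid for a non-zero power-of-two size. */
	assert(size != 0 && (size & (size - 1)) == 0);

	ring->head = 0;
	ring->tail = 0;
	ring->mask = size - 1;
	ring->data = data;
}

/* Number of stored entries. Unsigned subtraction is modulo 2^32, so the
 * result stays correct after tail has wrapped past UINT32_MAX. */
static inline uint32_t demo_ring_len(const demo_ring_t *ring)
{
	return ring->tail - ring->head;
}

/* Returns 0 on success, -1 when all 'size' entries are in use. */
static inline int demo_ring_enq(demo_ring_t *ring, uint32_t value)
{
	if (demo_ring_len(ring) > ring->mask)
		return -1;

	ring->data[ring->tail & ring->mask] = value;
	ring->tail++;
	return 0;
}

/* Returns 0 on success, -1 when the ring is empty. */
static inline int demo_ring_deq(demo_ring_t *ring, uint32_t *value)
{
	if (demo_ring_len(ring) == 0)
		return -1;

	*value = ring->data[ring->head & ring->mask];
	ring->head++;
	return 0;
}
```

Because empty is `tail == head` and full is `tail - head == size`, the two states are distinguishable without sacrificing a slot, which is the sense in which no entry is lost.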
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 32 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; Comment: OK, the `ODP_ASSERT()` would still be useful for debugging. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > That functionality is not obvious from the name. It either implies that one > of the input arguments is written (not true here) or the reader might assume > that it is an expression without side-effect and should be deleted (what I > originally thought when reading it). You should pick a routine name that > makes it clear it's actually doing something real, in this case performing > prefetch processing. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Elsewhere you write `if (ring_st_is_empty(...))`, not `if >> (ring_st_is_empty(...) == 1)` so this is inconsistent. >>> Petri Savolainen(psavol) wrote: >>> Didn't try larger than 32. 32 is already quite large from QoS point of >>> view. >>> >>> I'm planning to use config file for run time tuning, so this hard coding >>> may change in that phase. Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > One entry is not lost. User provided size is not (currently) used. Queue > size is always 4k. 
>> Petri Savolainen(psavol) wrote: >> One entry is not lost. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. Petri Savolainen(psavol) wrote: OK, added checks in v2. > Petri Savolainen(psavol) wrote: > OK. Compiler probably did that already, but changed in v2. >> Petri Savolainen(psavol) wrote: >> Tail and head indexes are (masked from) uint32_t and do not wrap >> around when the ring is full. I think you assume that the store >> index is 0...size-1, while it's full uint32_t which is then masked >> to get the actual index. >> >> For example: >> size = 100; >> >> Empty: >> head = 100 >> tail = 100 >> num = 100 - 100 = 0 >> >> Full: >> head = 100 >> tail = 200 >> num = 200 - 100 = 100 >> >> Wrap uint32_t + full: >> head = 0xFFFFFF9C >> tail = 0 >> num = 0 - 0xFFFFFF9C = 0x64 = 100 >> >> So, no abs() needed. Ring size can be 4096, instead of 4095. >>> Petri Savolainen(psavol) wrote: >>> It's already documented 5 lines above: >>> >>> /* Initialize ring. Ring size must be a power of two. */ >>> static inline void ring_st_init(ring_st_t *ring, uint32_t *data, >>> uint32_t size) >>> { Petri Savolainen(psavol) wrote: This function converts 32 bit buffer indexes to buffer header pointers. The inverse operation is buffer_index_from_buf(). The prefetch is a side effect of the function, which may be changed/moved any time if it's found out that there's a place for prefetching. I actually plan to test if number of prefetches should be limited as e.g. 32 consecutive prefetches may be too much for some CPU architectures. > Petri Savolainen(psavol) wrote: > I prefer style where '== 0' is used instead of '!'. Especially, > when the if clause is as complex as this and there's danger for > reader to miss the '!' sign. >> Petri Savolainen(psavol) wrote: >> It's there to ensure that all bits are zero also when someone >> would modify the bitfield from two to three fields later on. >> Similarly to memset() zero is used for struct inits. 
>>> Petri Savolainen(psavol) wrote: >>> There's no need for abs(). Since it's all uint32_t variables, >>> wraparound is handled already. >>> An example in 8 bits: >>> 0xff - 0xfd = 0x02 >>> 0x00 - 0xfe = 0x02 >>> 0x01 - 0xff = 0x02 >>> 0x02 - 0x00 = 0x02 >>> >>> This passes both gcc and clang, and is
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c @@ -471,51 +476,18 @@ static inline int deq_multi(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr[], } UNLOCK(>s.lock); - return 0; - } - - for (i = 0; i < num && hdr; ) { - int burst_num = hdr->burst_num; - int first = hdr->burst_first; - /* First, get bursted buffers */ - for (j = 0; j < burst_num && i < num; j++, i++) { - buf_hdr[i] = hdr->burst[first + j]; - odp_prefetch(buf_hdr[i]); - } - - if (burst_num) { - hdr->burst_num = burst_num - j; - hdr->burst_first = first + j; - } - - if (i == num) - break; - - /* When burst is empty, consume the current buffer header and -* move to the next header */ - buf_hdr[i] = hdr; - next = hdr->next; - hdr->next = NULL; - hdr= next; - updated++; - i++; + return 0; } - /* Write head only if updated */ - if (updated) - queue->s.head = hdr; - - /* Queue is empty */ - if (hdr == NULL) - queue->s.tail = NULL; - if (status_sync && queue->s.type == ODP_QUEUE_TYPE_SCHED) sched_fn->save_context(queue->s.index); UNLOCK(>s.lock); - return i; + buffer_index_to_buf(buf_hdr, buf_idx, num_deq); Comment: That functionality is not obvious from the name. It either implies that one of the input arguments is written (not true here) or the reader might assume that it is an expression without side-effect and should be deleted (what I originally thought when reading it). You should pick a routine name that makes it clear it's actually doing something real, in this case performing prefetch processing. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Elsewhere you write `if (ring_st_is_empty(...))`, not `if > (ring_st_is_empty(...) == 1)` so this is inconsistent. >> Petri Savolainen(psavol) wrote: >> Didn't try larger than 32. 32 is already quite large from QoS point of view. >> >> I'm planning to use config file for run time tunning, so this hard coding >> may change in that phase. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. 
Petri Savolainen(psavol) wrote: One entry is not lost. User provided size is not (currently) used. Queue size is always 4k. > Petri Savolainen(psavol) wrote: > One entry is not lost. >> Petri Savolainen(psavol) wrote: >> One entry is not lost. >>> Petri Savolainen(psavol) wrote: >>> OK, added checks in v2. Petri Savolainen(psavol) wrote: OK. Compiler probably did that already, but changed in v2. > Petri Savolainen(psavol) wrote: > Tail and head indexes are (masked from) uint32_t and do not wrap > around when the ring is full. I think you assume that the store index > is 0...size-1, while it's full uint32_t which is then masked to get > the actual index. > > For example: > size = 100; > > Empty: > head = 100 > tail = 100 > num = 100 - 100 = 0 > > Full: > head = 100 > tail = 200 > num = 200 - 100 = 100 > > Wrap uint32_t + full: > head = 0xFFFFFF9C > tail = 0 > num = 0 - 0xFFFFFF9C = 0x64 = 100 > > So, no abs() needed. Ring size can be 4096, instead of 4095. >> Petri Savolainen(psavol) wrote: >> It's already documented 5 lines above: >> >> /* Initialize ring. Ring size must be a power of two. */ >> static inline void ring_st_init(ring_st_t *ring, uint32_t *data, >> uint32_t size) >> { >>> Petri Savolainen(psavol) wrote: >>> This function converts 32 bit buffer indexes to buffer header >>> pointers. The inverse operation is buffer_index_from_buf(). The >>> prefetch is a side effect of the function, which may be >>> changed/moved any time if it's found out that there's a place for >>> prefetching. I actually plan to test if number of prefetches should >>> be limited as e.g. 32 consecutive prefetches may be too much for >>> some CPU architectures. Petri Savolainen(psavol) wrote: I prefer style where '== 0' is used instead of '!'. Especially, when the if clause is as complex as this and there's danger for reader to miss the '!' sign. > Petri Savolainen(psavol) wrote: > It's there to ensure that all bits are zero also when
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c @@ -263,7 +275,7 @@ static int queue_destroy(odp_queue_t handle) ODP_ERR("queue \"%s\" already destroyed\n", queue->s.name); return -1; } - if (queue->s.head != NULL) { + if (ring_st_is_empty(&queue->s.ring_st) == 0) { Comment: Elsewhere you write `if (ring_st_is_empty(...))`, not `if (ring_st_is_empty(...) == 1)` so this is inconsistent. > Petri Savolainen(psavol) wrote: > Didn't try larger than 32. 32 is already quite large from QoS point of view. > > I'm planning to use config file for run time tuning, so this hard coding may > change in that phase. >> Petri Savolainen(psavol) wrote: >> One entry is not lost. >>> Petri Savolainen(psavol) wrote: >>> One entry is not lost. User provided size is not (currently) used. Queue >>> size is always 4k. Petri Savolainen(psavol) wrote: One entry is not lost. > Petri Savolainen(psavol) wrote: > One entry is not lost. >> Petri Savolainen(psavol) wrote: >> OK, added checks in v2. >>> Petri Savolainen(psavol) wrote: >>> OK. Compiler probably did that already, but changed in v2. Petri Savolainen(psavol) wrote: Tail and head indexes are (masked from) uint32_t and do not wrap around when the ring is full. I think you assume that the store index is 0...size-1, while it's full uint32_t which is then masked to get the actual index. For example: size = 100; Empty: head = 100 tail = 100 num = 100 - 100 = 0 Full: head = 100 tail = 200 num = 200 - 100 = 100 Wrap uint32_t + full: head = 0xFFFFFF9C tail = 0 num = 0 - 0xFFFFFF9C = 0x64 = 100 So, no abs() needed. Ring size can be 4096, instead of 4095. > Petri Savolainen(psavol) wrote: > It's already documented 5 lines above: > > /* Initialize ring. Ring size must be a power of two. */ > static inline void ring_st_init(ring_st_t *ring, uint32_t *data, > uint32_t size) > { >> Petri Savolainen(psavol) wrote: >> This function converts 32 bit buffer indexes to buffer header >> pointers. 
The inverse operation is buffer_index_from_buf(). The >> prefetch is a side effect of the function, which may be >> changed/moved any time if it's found out that there's a place for >> prefetching. I actually plan to test if number of prefetches should >> be limited as e.g. 32 consecutive prefetches may be too much for >> some CPU architectures. >>> Petri Savolainen(psavol) wrote: >>> I prefer style where '== 0' is used instead of '!'. Especially, >>> when the if clause is as complex as this and there's danger for >>> reader to miss the '!' sign. Petri Savolainen(psavol) wrote: It's there to ensure that all bits are zero also when someone would modify the bitfield from two to three fields later on. Similarly to memset() zero is used for struct inits. > Petri Savolainen(psavol) wrote: > There's no need for abs(). Since it's all uint32_t variables, > wraparound is handled already. > An example in 8 bits: > 0xff - 0xfd = 0x02 > 0x00 - 0xfe = 0x02 > 0x01 - 0xff = 0x02 > 0x02 - 0x00 = 0x02 > > This passes both gcc and clang, and is used already in the other > ring implementation see ring_deq_multi(). >> Petri Savolainen(psavol) wrote: >> I prefer style with blank line in the end of a typedef, since >> it's easier to spot the type name (as it's not mixed into struct >> field names). Checkpatch passes so this style should be OK. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Does this mean that sizes larger than 32 have no added >>> performance benefit? Bill Fischofer(Bill-Fischofer-Linaro) wrote: Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not going to use the user-supplied queue size. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Given its name, this looks like an extraneous statement that > should be deleted. Renaming this to something like > `prefetch_dequeued_bufs()` would make the intent clearer here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more >> natural here. 
>>> Bill Fischofer(Bill-Fischofer-Linaro)
Re: [lng-odp] [PATCH v4] linux-generic: pool: Return address range in pool info
semihalf-mazur-michal replied on github web page: test/validation/api/pool/pool.c @@ -217,6 +217,50 @@ static void pool_test_info_packet(void) CU_ASSERT(odp_pool_destroy(pool) == 0); } +static void pool_test_info_data_range(void) +{ + odp_pool_t pool; + odp_pool_info_t info; + odp_pool_param_t param; + odp_packet_t pkt[PKT_NUM]; + uint32_t i, num; + uintptr_t pkt_data, pool_len; + + odp_pool_param_init(&param); + + param.type = ODP_POOL_PACKET; + param.pkt.num = PKT_NUM; + param.pkt.len = PKT_LEN; + + pool = odp_pool_create(NULL, &param); + CU_ASSERT_FATAL(pool != ODP_POOL_INVALID); + + CU_ASSERT_FATAL(odp_pool_info(pool, &info) == 0); + + pool_len = info.max_data_addr - info.min_data_addr + 1; + CU_ASSERT_FATAL(pool_len >= PKT_NUM * PKT_LEN); + + num = 0; + + for (i = 0; i < PKT_NUM; i++) { + pkt[num] = odp_packet_alloc(pool, PKT_LEN); + CU_ASSERT(pkt[num] != ODP_PACKET_INVALID); + + if (pkt[num] != ODP_PACKET_INVALID) + num++; + } + + for (i = 0; i < num; i++) { + pkt_data = (uintptr_t)odp_packet_data(pkt[i]); + CU_ASSERT_FATAL((pkt_data >= info.min_data_addr) && Comment: Fixed in v4 > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > I'd make this `CU_ASSERT()` rather than `CU_ASSERT_FATAL()` so all > discrepancies can be caught. `CU_ASSERT_FATAL()` is reserved for setup > failures that invalidate the entire test (_e.g.,_ not being able to create > the pool, `odp_pool_info()` reporting an error, etc.) https://github.com/Linaro/odp/pull/495#discussion_r169960459 updated_at 2018-02-22 13:53:17
[lng-odp] [PATCH API-NEXT v1 4/4] validation: packet: test packet_data_seg_len
From: Petri Savolainen

Test the new combined packet data and seg_len function. Signed-off-by: Petri Savolainen --- /** Email created from pull request 497 (psavol:next-packet-data-doc) ** https://github.com/Linaro/odp/pull/497 ** Patch: https://github.com/Linaro/odp/pull/497.patch ** Base sha: ea2afab619ae74108a03798bc358fdfcd29fdd88 ** Merge commit sha: d1c9a3d36dfe9e38ecfe7d4a52bebe13d0c01098 **/ test/validation/api/packet/packet.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/test/validation/api/packet/packet.c b/test/validation/api/packet/packet.c index f829d0cb1..78a14800c 100644 --- a/test/validation/api/packet/packet.c +++ b/test/validation/api/packet/packet.c @@ -604,7 +604,8 @@ void packet_test_basic_metadata(void) void packet_test_length(void) { odp_packet_t pkt = test_packet; - uint32_t buf_len, headroom, tailroom; + uint32_t buf_len, headroom, tailroom, seg_len; + void *data; odp_pool_capability_t capa; CU_ASSERT_FATAL(odp_pool_capability(&capa) == 0); @@ -612,8 +613,13 @@ void packet_test_length(void) buf_len = odp_packet_buf_len(pkt); headroom = odp_packet_headroom(pkt); tailroom = odp_packet_tailroom(pkt); + data = odp_packet_data(pkt); + CU_ASSERT(data != NULL); CU_ASSERT(odp_packet_len(pkt) == packet_len); + CU_ASSERT(odp_packet_seg_len(pkt) <= packet_len); + CU_ASSERT(odp_packet_data_seg_len(pkt, &seg_len) == data); + CU_ASSERT(seg_len == odp_packet_seg_len(pkt)); CU_ASSERT(headroom >= capa.pkt.min_headroom); CU_ASSERT(tailroom >= capa.pkt.min_tailroom);
[lng-odp] [PATCH v4 2/2] validation: pool: verify pool data range
From: Michal Mazur

Allocate maximum number of packets from pool and verify that packet data are located inside range returned by odp_pool_info. Signed-off-by: Michal Mazur --- /** Email created from pull request 495 (semihalf-mazur-michal:master) ** https://github.com/Linaro/odp/pull/495 ** Patch: https://github.com/Linaro/odp/pull/495.patch ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb ** Merge commit sha: f5123472c155c4cfbeb784ad58088c523ff751b5 **/ test/validation/api/pool/pool.c | 45 + 1 file changed, 45 insertions(+) diff --git a/test/validation/api/pool/pool.c b/test/validation/api/pool/pool.c index 34f973573..fa7e65f41 100644 --- a/test/validation/api/pool/pool.c +++ b/test/validation/api/pool/pool.c @@ -217,6 +217,50 @@ static void pool_test_info_packet(void) CU_ASSERT(odp_pool_destroy(pool) == 0); } +static void pool_test_info_data_range(void) +{ + odp_pool_t pool; + odp_pool_info_t info; + odp_pool_param_t param; + odp_packet_t pkt[PKT_NUM]; + uint32_t i, num; + uintptr_t pkt_data, pool_len; + + odp_pool_param_init(&param); + + param.type = ODP_POOL_PACKET; + param.pkt.num = PKT_NUM; + param.pkt.len = PKT_LEN; + + pool = odp_pool_create(NULL, &param); + CU_ASSERT_FATAL(pool != ODP_POOL_INVALID); + + CU_ASSERT_FATAL(odp_pool_info(pool, &info) == 0); + + pool_len = info.max_data_addr - info.min_data_addr + 1; + CU_ASSERT(pool_len >= PKT_NUM * PKT_LEN); + + num = 0; + + for (i = 0; i < PKT_NUM; i++) { + pkt[num] = odp_packet_alloc(pool, PKT_LEN); + CU_ASSERT(pkt[num] != ODP_PACKET_INVALID); + + if (pkt[num] != ODP_PACKET_INVALID) + num++; + } + + for (i = 0; i < num; i++) { + pkt_data = (uintptr_t)odp_packet_data(pkt[i]); + CU_ASSERT((pkt_data >= info.min_data_addr) && + (pkt_data + PKT_LEN - 1 <= info.max_data_addr)); + + odp_packet_free(pkt[i]); + } + + CU_ASSERT(odp_pool_destroy(pool) == 0); +} + odp_testinfo_t pool_suite[] = { ODP_TEST_INFO(pool_test_create_destroy_buffer), ODP_TEST_INFO(pool_test_create_destroy_packet), @@ -225,6 +269,7 @@ odp_testinfo_t 
pool_suite[] = { ODP_TEST_INFO(pool_test_alloc_packet_subparam), ODP_TEST_INFO(pool_test_info_packet), ODP_TEST_INFO(pool_test_lookup_info_print), + ODP_TEST_INFO(pool_test_info_data_range), ODP_TEST_INFO_NULL, };
[lng-odp] [PATCH v4 1/2] linux-generic: pool: Return address range in pool info
From: Michal Mazur

Implement support in odp_pool_info function to provide address range of pool data available to application. Pull request of related API change: https://github.com/Linaro/odp/pull/200 Signed-off-by: Michal Mazur --- /** Email created from pull request 495 (semihalf-mazur-michal:master) ** https://github.com/Linaro/odp/pull/495 ** Patch: https://github.com/Linaro/odp/pull/495.patch ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb ** Merge commit sha: f5123472c155c4cfbeb784ad58088c523ff751b5 **/ platform/linux-generic/odp_pool.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/platform/linux-generic/odp_pool.c b/platform/linux-generic/odp_pool.c index e5ba8982a..03578135c 100644 --- a/platform/linux-generic/odp_pool.c +++ b/platform/linux-generic/odp_pool.c @@ -693,6 +693,9 @@ int odp_pool_info(odp_pool_t pool_hdl, odp_pool_info_t *info) if (pool->params.type == ODP_POOL_PACKET) info->pkt.max_num = pool->num; + info->min_data_addr = (uintptr_t)pool->base_addr; + info->max_data_addr = (uintptr_t)pool->base_addr + pool->shm_size - 1; + return 0; }
[lng-odp] [PATCH v4 0/2] linux-generic: pool: Return address range in pool info
Implement support in odp_pool_info function to provide address range of pool data available to application. Similar change was already merged to caterpillar/linux-dpdk: #400 Pull request of related API change: #200 github /** Email created from pull request 495 (semihalf-mazur-michal:master) ** https://github.com/Linaro/odp/pull/495 ** Patch: https://github.com/Linaro/odp/pull/495.patch ** Base sha: 5a58bbf2bb331fd7dde2ebbc0430634ace6900fb ** Merge commit sha: f5123472c155c4cfbeb784ad58088c523ff751b5 **/ /github checkpatch.pl total: 0 errors, 0 warnings, 0 checks, 9 lines checked to_send-p-000.patch has no obvious style problems and is ready for submission. CHECK: Alignment should match open parenthesis #65: FILE: test/validation/api/pool/pool.c:256: + CU_ASSERT((pkt_data >= info.min_data_addr) && + (pkt_data + PKT_LEN - 1 <= info.max_data_addr)); total: 0 errors, 0 warnings, 1 checks, 57 lines checked to_send-p-001.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. /checkpatch.pl
[lng-odp] [PATCH API-NEXT v1 3/4] linux-gen: packet: implement packet_data_seg_len
From: Petri SavolainenImplement the new combined packet data and seg_len function. Signed-off-by: Petri Savolainen --- /** Email created from pull request 497 (psavol:next-packet-data-doc) ** https://github.com/Linaro/odp/pull/497 ** Patch: https://github.com/Linaro/odp/pull/497.patch ** Base sha: ea2afab619ae74108a03798bc358fdfcd29fdd88 ** Merge commit sha: d1c9a3d36dfe9e38ecfe7d4a52bebe13d0c01098 **/ platform/linux-generic/include/odp/api/plat/packet_inlines.h | 7 +++ platform/linux-generic/include/odp/api/plat/packet_inlines_api.h | 5 + 2 files changed, 12 insertions(+) diff --git a/platform/linux-generic/include/odp/api/plat/packet_inlines.h b/platform/linux-generic/include/odp/api/plat/packet_inlines.h index b6b493363..ae90ec5bf 100644 --- a/platform/linux-generic/include/odp/api/plat/packet_inlines.h +++ b/platform/linux-generic/include/odp/api/plat/packet_inlines.h @@ -63,6 +63,13 @@ static inline uint32_t _odp_packet_seg_len(odp_packet_t pkt) return _odp_pkt_get(pkt, uint32_t, seg_len); } +static inline void *_odp_packet_data_seg_len(odp_packet_t pkt, +uint32_t *seg_len) +{ + *seg_len = _odp_packet_seg_len(pkt); + return _odp_packet_data(pkt); +} + static inline uint32_t _odp_packet_len(odp_packet_t pkt) { return _odp_pkt_get(pkt, uint32_t, frame_len); diff --git a/platform/linux-generic/include/odp/api/plat/packet_inlines_api.h b/platform/linux-generic/include/odp/api/plat/packet_inlines_api.h index d0f3adc12..76210e005 100644 --- a/platform/linux-generic/include/odp/api/plat/packet_inlines_api.h +++ b/platform/linux-generic/include/odp/api/plat/packet_inlines_api.h @@ -23,6 +23,11 @@ _ODP_INLINE uint32_t odp_packet_seg_len(odp_packet_t pkt) return _odp_packet_seg_len(pkt); } +_ODP_INLINE void *odp_packet_data_seg_len(odp_packet_t pkt, uint32_t *seg_len) +{ + return _odp_packet_data_seg_len(pkt, seg_len); +} + _ODP_INLINE uint32_t odp_packet_len(odp_packet_t pkt) { return _odp_packet_len(pkt);
[lng-odp] [PATCH API-NEXT v1 2/4] api: packet: add combined packet data and seg len
From: Petri Savolainen

Packet data pointer and segment length used often. Combine two calls
into one call. One call performs better in ABI compatible mode than
two calls.

Signed-off-by: Petri Savolainen
---
/** Email created from pull request 497 (psavol:next-packet-data-doc)
 ** https://github.com/Linaro/odp/pull/497
 ** Patch: https://github.com/Linaro/odp/pull/497.patch
 ** Base sha: ea2afab619ae74108a03798bc358fdfcd29fdd88
 ** Merge commit sha: d1c9a3d36dfe9e38ecfe7d4a52bebe13d0c01098
 **/
 include/odp/api/spec/packet.h | 16
 1 file changed, 16 insertions(+)

diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h
index 746f6fbf7..e1f2f2218 100644
--- a/include/odp/api/spec/packet.h
+++ b/include/odp/api/spec/packet.h
@@ -439,6 +439,22 @@ void *odp_packet_data(odp_packet_t pkt);
  */
 uint32_t odp_packet_seg_len(odp_packet_t pkt);
 
+/**
+ * Packet data pointer with segment length
+ *
+ * Returns both data pointer and number of data bytes (in the segment)
+ * following it. This is equivalent to calling odp_packet_data() and
+ * odp_packet_seg_len().
+ *
+ * @param pkt Packet handle
+ * @param[out] seg_len Pointer to output segment length
+ *
+ * @return Pointer to the packet data
+ *
+ * @see odp_packet_data(), odp_packet_seg_len()
+ */
+void *odp_packet_data_seg_len(odp_packet_t pkt, uint32_t *seg_len);
+
 /**
  * Packet data length
  *
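As a rough illustration of how a caller might use the combined accessor from the patch above, the following sketch reads only the bytes that are linearly accessible after the data pointer. It does not use the ODP headers: `mock_pkt_t` and every `mock_*` function are hypothetical stand-ins invented for this example, and only the shape of `mock_packet_data_seg_len()` mirrors the proposed `odp_packet_data_seg_len()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet; this is NOT the ODP packet type. */
typedef struct {
	uint8_t *data;      /* first byte of packet data */
	uint32_t seg_len;   /* bytes linearly accessible after data */
	uint32_t frame_len; /* total data length over all segments */
} mock_pkt_t;

static void *mock_packet_data(const mock_pkt_t *pkt)
{
	return pkt->data;
}

static uint32_t mock_packet_seg_len(const mock_pkt_t *pkt)
{
	return pkt->seg_len;
}

/* Same shape as the proposed call: one call returns pointer and length. */
static void *mock_packet_data_seg_len(const mock_pkt_t *pkt, uint32_t *seg_len)
{
	*seg_len = mock_packet_seg_len(pkt);
	return mock_packet_data(pkt);
}

/* Sum at most max_bytes bytes, never reading past the first segment. */
static uint32_t sum_first_segment(const mock_pkt_t *pkt, uint32_t max_bytes)
{
	uint32_t seg_len, i, sum = 0;
	const uint8_t *p = mock_packet_data_seg_len(pkt, &seg_len);
	uint32_t n = max_bytes < seg_len ? max_bytes : seg_len;

	for (i = 0; i < n; i++)
		sum += p[i];

	return sum;
}
```

The point of the sketch is the clamp against `seg_len`: for a segmented packet the total length can exceed what follows the data pointer, so the caller bounds its linear access by the returned segment length rather than by the packet length.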
[lng-odp] [PATCH API-NEXT v1 1/4] api: packet: improve segmented packet documentation
From: Petri Savolainen

Improve documentation text to be more explicit that packets may be
segmented.

Signed-off-by: Petri Savolainen
---
/** Email created from pull request 497 (psavol:next-packet-data-doc)
 ** https://github.com/Linaro/odp/pull/497
 ** Patch: https://github.com/Linaro/odp/pull/497.patch
 ** Base sha: ea2afab619ae74108a03798bc358fdfcd29fdd88
 ** Merge commit sha: d1c9a3d36dfe9e38ecfe7d4a52bebe13d0c01098
 **/
 include/odp/api/spec/packet.h | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h
index 079a1ae1b..746f6fbf7 100644
--- a/include/odp/api/spec/packet.h
+++ b/include/odp/api/spec/packet.h
@@ -401,30 +401,39 @@ uint32_t odp_packet_buf_len(odp_packet_t pkt);
 /**
  * Packet data pointer
  *
- * Returns the current packet data pointer. When a packet is received
- * from packet input, this points to the first byte of the received
- * packet. Packet level offsets are calculated relative to this position.
+ * Returns pointer to the first byte of packet data. When packet is segmented,
+ * only a portion of packet data follows the pointer. When unsure, use e.g.
+ * odp_packet_seg_len() to check the data length following the pointer. Packet
+ * level offsets are calculated relative to this position.
  *
- * User can adjust the data pointer with head_push/head_pull (does not modify
- * segmentation) and add_data/rem_data calls (may modify segmentation).
+ * When a packet is received from packet input, this points to the first byte
+ * of the received packet. Pool configuration parameters may be used to ensure
+ * that the first packet segment contains all/most of the data relevant to the
+ * application.
+ *
+ * User can adjust the data pointer with e.g. push_head/pull_head (does not
+ * modify segmentation) and extend_head/trunc_head (may modify segmentation)
+ * calls.
  *
  * @param pkt Packet handle
  *
  * @return Pointer to the packet data
  *
- * @see odp_packet_l2_ptr(), odp_packet_seg_len()
+ * @see odp_packet_seg_len(), odp_packet_push_head(), odp_packet_extend_head()
  */
 void *odp_packet_data(odp_packet_t pkt);
 
 /**
- * Packet segment data length
+ * Packet data length following the data pointer
  *
- * Returns number of data bytes following the current data pointer
- * (odp_packet_data()) location in the segment.
+ * Returns number of data bytes (in the segment) following the current data
+ * pointer position. When unsure, use this function to check how many bytes
+ * can be accessed linearly after data pointer (odp_packet_data()). This
+ * equals to odp_packet_len() for single segment packets.
  *
  * @param pkt Packet handle
  *
- * @return Segment data length in bytes (pointed by odp_packet_data())
+ * @return Segment data length in bytes following odp_packet_data()
  *
  * @see odp_packet_data()
  */
@@ -433,11 +442,14 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);
 /**
  * Packet data length
  *
- * Returns sum of data lengths over all packet segments.
+ * Returns total data length over all packet segments. This equals the sum of
+ * segment level data lengths (odp_packet_seg_data_len()).
  *
  * @param pkt Packet handle
  *
  * @return Packet data length
+ *
+ * @see odp_packet_seg_len(), odp_packet_data(), odp_packet_seg_data_len()
  */
 uint32_t odp_packet_len(odp_packet_t pkt);
[lng-odp] [PATCH API-NEXT v1 0/4] api: packet: data and segment length
Improved documentation and added new combined function for better
performance in ABI compat mode.

github
/** Email created from pull request 497 (psavol:next-packet-data-doc)
 ** https://github.com/Linaro/odp/pull/497
 ** Patch: https://github.com/Linaro/odp/pull/497.patch
 ** Base sha: ea2afab619ae74108a03798bc358fdfcd29fdd88
 ** Merge commit sha: d1c9a3d36dfe9e38ecfe7d4a52bebe13d0c01098
 **/
/github

checkpatch.pl
total: 0 errors, 0 warnings, 0 checks, 64 lines checked

to_send-p-000.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 22 lines checked

to_send-p-001.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 24 lines checked

to_send-p-002.patch has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 0 checks, 22 lines checked

to_send-p-003.patch has no obvious style problems and is ready for submission.
/checkpatch.pl
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page:

platform/linux-generic/odp_queue.c
line 420
@@ -584,8 +556,9 @@ static int queue_init(queue_entry_t *queue, const char *name,
 	queue->s.pktin = PKTIN_INVALID;
 	queue->s.pktout = PKTOUT_INVALID;
-	queue->s.head = NULL;
-	queue->s.tail = NULL;
+	ring_st_init(&queue->s.ring_st,
+		     queue_tbl->ring_data[queue->s.index].data,
+		     CONFIG_QUEUE_SIZE);

Comment:
One entry is not lost.

> Petri Savolainen(psavol) wrote:
> One entry is not lost. User provided size is not (currently) used. Queue
> size is always 4k.

>> Petri Savolainen(psavol) wrote:
>> One entry is not lost.

>>> Petri Savolainen(psavol) wrote:
>>> One entry is not lost.

Petri Savolainen(psavol) wrote:
OK, added checks in v2.

> Petri Savolainen(psavol) wrote:
> OK. Compiler probably did that already, but changed in v2.

>> Petri Savolainen(psavol) wrote:
>> Tail and head indexes are (masked from) uint32_t and do not wrap around
>> when the ring is full. I think you assume that the store index is
>> 0...size-1, while it's full uint32_t which is then masked to get the
>> actual index.
>>
>> For example:
>> size = 100;
>>
>> Empty:
>> head = 100
>> tail = 100
>> num = 100 - 100 = 0
>>
>> Full:
>> head = 100
>> tail = 200
>> num = 200 - 100 = 100
>>
>> Wrap uint32_t + full:
>> head = 0xFFFFFF9C
>> tail = 0
>> num = 0 - 0xFFFFFF9C = 0x64 = 100
>>
>> So, no abs() needed. Ring size can be 4096, instead of 4095.

>>> Petri Savolainen(psavol) wrote:
>>> It's already documented 5 lines above:
>>>
>>> /* Initialize ring. Ring size must be a power of two. */
>>> static inline void ring_st_init(ring_st_t *ring, uint32_t *data,
>>> uint32_t size)
>>> {

Petri Savolainen(psavol) wrote:
This function converts 32 bit buffer indexes to buffer header pointers.
The reverse operation is buffer_index_from_buf(). The prefetch is a side
effect of the function, which may be changed/moved any time if it's found
out that there's a place for prefetching. I actually plan to test if
number of prefetches should be limited as e.g. 32 consecutive prefetches
may be too much for some CPU architectures.

> Petri Savolainen(psavol) wrote:
> I prefer style where '== 0' is used instead of '!', especially when
> the if clause is as complex as this and there's a danger for the reader
> to miss the '!' sign.

>> Petri Savolainen(psavol) wrote:
>> It's there to ensure that all bits are zero even if someone modifies
>> the bitfield from two to three fields later on, similarly to how
>> memset() to zero is used for struct inits.

>>> Petri Savolainen(psavol) wrote:
>>> There's no need for abs(). Since it's all uint32_t variables,
>>> wraparound is handled already.
>>> An example in 8 bits:
>>> 0xff - 0xfd = 0x02
>>> 0x00 - 0xfe = 0x02
>>> 0x01 - 0xff = 0x02
>>> 0x02 - 0x00 = 0x02
>>>
>>> This passes both gcc and clang, and is already used in the other
>>> ring implementation, see ring_deq_multi().

Petri Savolainen(psavol) wrote:
I prefer style with a blank line at the end of a typedef, since it's
easier to spot the type name (as it's not mixed into struct field names).
Checkpatch passes so this style should be OK.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Does this mean that sizes larger than 32 have no added
> performance benefit?

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if
>> we're not going to use the user-supplied queue size.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Given its name, this looks like an extraneous statement that
>>> should be deleted. Renaming this to something like
>>> `prefetch_dequeued_bufs()` would make the intent clearer here.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
`if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle
> the effective queue capacity. The user-supplied `size` should
> then be set to `ROUNDUP_POWER2_U32(size) - 1` for the masking
> to work properly.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Same comment here as for plain queues.

>>> Bill
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page:

platform/linux-generic/include/odp_config_internal.h
line 5
@@ -144,7 +144,7 @@ extern "C" {
  * This controls the burst size on various enqueue, dequeue, etc calls. Large
  * burst size improves throughput, but may degrade QoS (increase latency).
  */
-#define CONFIG_BURST_SIZE 16
+#define CONFIG_BURST_SIZE 32

Comment:
Didn't try larger than 32. 32 is already quite large from a QoS point of
view. I'm planning to use a config file for run-time tuning, so this hard
coding may change in that phase.

> Petri Savolainen(psavol) wrote:
> One entry is not lost.

>> Petri Savolainen(psavol) wrote:
>> One entry is not lost. User provided size is not (currently) used. Queue
>> size is always 4k.

>>> Petri Savolainen(psavol) wrote:
>>> One entry is not lost.

Petri Savolainen(psavol) wrote:
One entry is not lost.

> Petri Savolainen(psavol) wrote:
> OK, added checks in v2.

>> Petri Savolainen(psavol) wrote:
>> OK. Compiler probably did that already, but changed in v2.

>>> Petri Savolainen(psavol) wrote:
>>> Tail and head indexes are (masked from) uint32_t and do not wrap around
>>> when the ring is full. I think you assume that the store index is
>>> 0...size-1, while it's full uint32_t which is then masked to get the
>>> actual index.
>>>
>>> For example:
>>> size = 100;
>>>
>>> Empty:
>>> head = 100
>>> tail = 100
>>> num = 100 - 100 = 0
>>>
>>> Full:
>>> head = 100
>>> tail = 200
>>> num = 200 - 100 = 100
>>>
>>> Wrap uint32_t + full:
>>> head = 0xFFFFFF9C
>>> tail = 0
>>> num = 0 - 0xFFFFFF9C = 0x64 = 100
>>>
>>> So, no abs() needed. Ring size can be 4096, instead of 4095.

Petri Savolainen(psavol) wrote:
It's already documented 5 lines above:

/* Initialize ring. Ring size must be a power of two. */
static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size)
{

> Petri Savolainen(psavol) wrote:
> This function converts 32 bit buffer indexes to buffer header
> pointers. The reverse operation is buffer_index_from_buf(). The
> prefetch is a side effect of the function, which may be changed/moved
> any time if it's found out that there's a place for prefetching. I
> actually plan to test if number of prefetches should be limited as
> e.g. 32 consecutive prefetches may be too much for some CPU
> architectures.

>> Petri Savolainen(psavol) wrote:
>> I prefer style where '== 0' is used instead of '!', especially when
>> the if clause is as complex as this and there's a danger for the reader
>> to miss the '!' sign.

>>> Petri Savolainen(psavol) wrote:
>>> It's there to ensure that all bits are zero even if someone modifies
>>> the bitfield from two to three fields later on, similarly to how
>>> memset() to zero is used for struct inits.

Petri Savolainen(psavol) wrote:
There's no need for abs(). Since it's all uint32_t variables, wraparound
is handled already.
An example in 8 bits:
0xff - 0xfd = 0x02
0x00 - 0xfe = 0x02
0x01 - 0xff = 0x02
0x02 - 0x00 = 0x02

This passes both gcc and clang, and is already used in the other ring
implementation, see ring_deq_multi().

> Petri Savolainen(psavol) wrote:
> I prefer style with a blank line at the end of a typedef, since
> it's easier to spot the type name (as it's not mixed into struct
> field names). Checkpatch passes so this style should be OK.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Does this mean that sizes larger than 32 have no added
>> performance benefit?

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if
>>> we're not going to use the user-supplied queue size.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Given its name, this looks like an extraneous statement that should be
deleted. Renaming this to something like `prefetch_dequeued_bufs()` would
make the intent clearer here.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more
> natural here.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle
>> the effective queue capacity. The user-supplied `size`
>> should then be set to `ROUNDUP_POWER2_U32(size) - 1` for the
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page:

platform/linux-generic/odp_queue.c
@@ -471,51 +476,18 @@ static inline int deq_multi(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr[],
 		}
 		UNLOCK(&queue->s.lock);
-		return 0;
-	}
-
-	for (i = 0; i < num && hdr; ) {
-		int burst_num = hdr->burst_num;
-		int first = hdr->burst_first;
-		/* First, get bursted buffers */
-		for (j = 0; j < burst_num && i < num; j++, i++) {
-			buf_hdr[i] = hdr->burst[first + j];
-			odp_prefetch(buf_hdr[i]);
-		}
-
-		if (burst_num) {
-			hdr->burst_num = burst_num - j;
-			hdr->burst_first = first + j;
-		}
-
-		if (i == num)
-			break;
-
-		/* When burst is empty, consume the current buffer header and
-		 * move to the next header */
-		buf_hdr[i] = hdr;
-		next = hdr->next;
-		hdr->next = NULL;
-		hdr = next;
-		updated++;
-		i++;
+		return 0;
 	}
-	/* Write head only if updated */
-	if (updated)
-		queue->s.head = hdr;
-
-	/* Queue is empty */
-	if (hdr == NULL)
-		queue->s.tail = NULL;
-
 	if (status_sync && queue->s.type == ODP_QUEUE_TYPE_SCHED)
 		sched_fn->save_context(queue->s.index);
 	UNLOCK(&queue->s.lock);
-	return i;
+	buffer_index_to_buf(buf_hdr, buf_idx, num_deq);

Comment:
This function converts 32 bit buffer indexes to buffer header pointers.
The reverse operation is buffer_index_from_buf(). The prefetch is a side
effect of the function, which may be changed/moved any time if it's found
out that there's a place for prefetching. I actually plan to test if
number of prefetches should be limited as e.g. 32 consecutive prefetches
may be too much for some CPU architectures.

> Petri Savolainen(psavol) wrote:
> I prefer style where '== 0' is used instead of '!', especially when the
> if clause is as complex as this and there's a danger for the reader to
> miss the '!' sign.

>> Petri Savolainen(psavol) wrote:
>> It's there to ensure that all bits are zero even if someone modifies
>> the bitfield from two to three fields later on, similarly to how
>> memset() to zero is used for struct inits.

>>> Petri Savolainen(psavol) wrote:
>>> There's no need for abs(). Since it's all uint32_t variables,
>>> wraparound is handled already.
>>> An example in 8 bits:
>>> 0xff - 0xfd = 0x02
>>> 0x00 - 0xfe = 0x02
>>> 0x01 - 0xff = 0x02
>>> 0x02 - 0x00 = 0x02
>>>
>>> This passes both gcc and clang, and is already used in the other ring
>>> implementation, see ring_deq_multi().

Petri Savolainen(psavol) wrote:
I prefer style with a blank line at the end of a typedef, since it's
easier to spot the type name (as it's not mixed into struct field names).
Checkpatch passes so this style should be OK.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Does this mean that sizes larger than 32 have no added performance
> benefit?

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not
>> going to use the user-supplied queue size.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Given its name, this looks like an extraneous statement that should be
>>> deleted. Renaming this to something like `prefetch_dequeued_bufs()`
>>> would make the intent clearer here.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
`if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the
> effective queue capacity. The user-supplied `size` should then be set
> to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Same comment here as for plain queues.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> As noted earlier, due to "losing" one entry to distinguish queue
>>> empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and
>>> we also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power
>>> of 2.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Since you're initializing `index.pool` and `index.buffer` there's no need
to set `index.u32` here.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> We originally had this index partitioning based on
> `ODP_CONFIG_POOLS`. Do we want to return to that here?
>
> If not, we at least need an
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page:

platform/linux-generic/include/odp_ring_st_internal.h
line 32
@@ -0,0 +1,118 @@
+/* Copyright (c) 2018, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef ODP_RING_ST_INTERNAL_H_
+#define ODP_RING_ST_INTERNAL_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include
+#include
+
+/* Basic ring for single thread usage. Operations must be synchronized by using
+ * locks (or other means), when multiple threads use the same ring. */
+typedef struct {
+	uint32_t head;
+	uint32_t tail;
+	uint32_t mask;
+	uint32_t *data;
+
+} ring_st_t;
+
+/* Initialize ring. Ring size must be a power of two. */
+static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size)
+{
+	ring->head = 0;
+	ring->tail = 0;
+	ring->mask = size - 1;

Comment:
This only works if `size` is a power of 2. Should be documented as such,
since this is an internal routine. In this case an
`ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement would
be a useful debugging aid.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> should be `num = abs(tail - head);` to deal with wrap arounds, otherwise may
> be misinterpreted as overly large since it's `uint32_t`. Note that GCC and
> clang recognize `abs()` and treat it as a builtin, so there's no actual
> `stdlib.h` call here.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Extra blank line should be removed (nit).

https://github.com/Linaro/odp/pull/492#discussion_r169793906
updated_at 2018-02-22 09:32:40
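Whether `abs()` is needed for `tail - head` comes down to how C defines unsigned subtraction: the result is reduced modulo 2^N, so the difference of two free-running counters stays correct across the wrap point. The sketch below checks this for 32-bit and 8-bit counters. The helper names `ring_len_u32()` and `ring_len_u8()` are made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Entries stored between two free-running 32-bit counters.
 * Unsigned subtraction wraps modulo 2^32, so no abs() is required. */
static inline uint32_t ring_len_u32(uint32_t head, uint32_t tail)
{
	return tail - head;
}

/* Same idea in 8 bits. The operands are promoted to int before the
 * subtraction, so the cast back to uint8_t performs the modulo 2^8 step. */
static inline uint8_t ring_len_u8(uint8_t head, uint8_t tail)
{
	return (uint8_t)(tail - head);
}
```

With these helpers, `ring_len_u32(100, 200)` gives 100, and a head of `0xFFFFFF9C` (100 below the wrap point) with a tail of 0 also gives 100. Note that this only holds while the counters are compared before masking; the mask is applied only when indexing the data array.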
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page:

platform/linux-generic/odp_queue.c
line 95
@@ -143,8 +150,10 @@ static int queue_capability(odp_queue_capability_t *capa)
 	capa->max_sched_groups = sched_fn->num_grps();
 	capa->sched_prios = odp_schedule_num_prio();
 	capa->plain.max_num = capa->max_queues;
+	capa->plain.max_size = CONFIG_QUEUE_SIZE;
 	capa->plain.nonblocking = ODP_BLOCKING;
 	capa->sched.max_num = capa->max_queues;
+	capa->sched.max_size = CONFIG_QUEUE_SIZE;

Comment:
Same comment here as for plain queues.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> As noted earlier, due to "losing" one entry to distinguish queue empty/full,
> this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we also need to
> ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Since you're initializing `index.pool` and `index.buffer` there's no need to
>> set `index.u32` here.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do
>>> we want to return to that here?
>>>
>>> If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that
>>> `ODP_CONFIG_POOLS < 256` or else bad things will happen here.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
This routine can be optimized to:
```
return ring->head == ring->tail;
```

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Your invariant is the queue is empty when `head == tail` therefore the
> queue is full when `abs(tail - head) == mask`, so the correct calculation
> here is:
>
> `num = mask - abs(tail - head);`
>
> The effect is that a queue can only hold `size - 1` elements, otherwise
> you cannot distinguish between a full and an empty queue without another
> bit of metadata, which is a cost you're trying to avoid.
>
> This is somewhat problematic if the caller is trying to be "optimal" by
> specifying a power of two in the `size` parameter of the
> `odp_queue_param_t` passed to `odp_queue_create()`. For this reason we
> may wish to return a `max_size` of a power of 2 - 1 in
> `odp_queue_capability()` as part of this patch series.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> This only works if `size` is a power of 2. Should be documented as such,
>> since this is an internal routine. In this case an `ODP_ASSERT(size ==
>> ROUNDUP_POWER2_U32(size))` for this requirement would be a useful
>> debugging aid.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> should be `num = abs(tail - head);` to deal with wrap arounds,
>>> otherwise may be misinterpreted as overly large since it's `uint32_t`.
>>> Note that GCC and clang recognize `abs()` and treat it as a builtin, so
>>> there's no actual `stdlib.h` call here.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Extra blank line should be removed (nit).

https://github.com/Linaro/odp/pull/492#discussion_r169821967
updated_at 2018-02-22 09:32:40
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page:

platform/linux-generic/odp_queue.c
line 103
@@ -192,6 +201,9 @@ static odp_queue_t queue_create(const char *name,
 		param = _param;
 	}
+	if (param->size > CONFIG_QUEUE_SIZE)

Comment:
One entry is not lost. User provided size is not (currently) used. Queue
size is always 4k.

> Petri Savolainen(psavol) wrote:
> One entry is not lost.

>> Petri Savolainen(psavol) wrote:
>> One entry is not lost.

>>> Petri Savolainen(psavol) wrote:
>>> OK, added checks in v2.

Petri Savolainen(psavol) wrote:
OK. Compiler probably did that already, but changed in v2.

> Petri Savolainen(psavol) wrote:
> Tail and head indexes are (masked from) uint32_t and do not wrap around
> when the ring is full. I think you assume that the store index is
> 0...size-1, while it's full uint32_t which is then masked to get the
> actual index.
>
> For example:
> size = 100;
>
> Empty:
> head = 100
> tail = 100
> num = 100 - 100 = 0
>
> Full:
> head = 100
> tail = 200
> num = 200 - 100 = 100
>
> Wrap uint32_t + full:
> head = 0xFFFFFF9C
> tail = 0
> num = 0 - 0xFFFFFF9C = 0x64 = 100
>
> So, no abs() needed. Ring size can be 4096, instead of 4095.

>> Petri Savolainen(psavol) wrote:
>> It's already documented 5 lines above:
>>
>> /* Initialize ring. Ring size must be a power of two. */
>> static inline void ring_st_init(ring_st_t *ring, uint32_t *data,
>> uint32_t size)
>> {

>>> Petri Savolainen(psavol) wrote:
>>> This function converts 32 bit buffer indexes to buffer header pointers.
>>> The reverse operation is buffer_index_from_buf(). The prefetch is a
>>> side effect of the function, which may be changed/moved any time if
>>> it's found out that there's a place for prefetching. I actually plan to
>>> test if number of prefetches should be limited as e.g. 32 consecutive
>>> prefetches may be too much for some CPU architectures.

Petri Savolainen(psavol) wrote:
I prefer style where '== 0' is used instead of '!', especially when the
if clause is as complex as this and there's a danger for the reader to
miss the '!' sign.

> Petri Savolainen(psavol) wrote:
> It's there to ensure that all bits are zero even if someone modifies
> the bitfield from two to three fields later on, similarly to how
> memset() to zero is used for struct inits.

>> Petri Savolainen(psavol) wrote:
>> There's no need for abs(). Since it's all uint32_t variables,
>> wraparound is handled already.
>> An example in 8 bits:
>> 0xff - 0xfd = 0x02
>> 0x00 - 0xfe = 0x02
>> 0x01 - 0xff = 0x02
>> 0x02 - 0x00 = 0x02
>>
>> This passes both gcc and clang, and is already used in the other
>> ring implementation, see ring_deq_multi().

>>> Petri Savolainen(psavol) wrote:
>>> I prefer style with a blank line at the end of a typedef, since it's
>>> easier to spot the type name (as it's not mixed into struct field
>>> names). Checkpatch passes so this style should be OK.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Does this mean that sizes larger than 32 have no added performance
benefit?

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're
> not going to use the user-supplied queue size.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> Given its name, this looks like an extraneous statement that
>> should be deleted. Renaming this to something like
>> `prefetch_dequeued_bufs()` would make the intent clearer here.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural
>>> here.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective
queue capacity. The user-supplied `size` should then be set to
`ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Same comment here as for plain queues.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> As noted earlier, due to "losing" one entry to distinguish
>> queue empty/full, this should be returned as
>> `CONFIG_QUEUE_SIZE - 1`, and we also need to ensure that
>> `CONFIG_QUEUE_SIZE` is itself a power of 2.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
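The disagreement in this thread is whether a ring of `size` slots can hold `size` or only `size - 1` entries. A minimal sketch of the free-running index scheme described in the replies above follows; it is not the code from the pull request. The `demo_ring_*` names, the enqueue/dequeue signatures and the return codes are assumptions made for this illustration, and only the `head`/`tail`/`mask`/`data` layout follows the quoted `ring_st_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Single-thread ring with free-running counters (hypothetical demo type). */
typedef struct {
	uint32_t head;  /* read counter, never masked when stored */
	uint32_t tail;  /* write counter, never masked when stored */
	uint32_t mask;  /* size - 1, size must be a power of two */
	uint32_t *data;
} demo_ring_t;

static inline void demo_ring_init(demo_ring_t *ring, uint32_t *data,
				  uint32_t size)
{
	/* Power-of-two check: exactly one bit set */
	assert(size != 0 && (size & (size - 1)) == 0);
	ring->head = 0;
	ring->tail = 0;
	ring->mask = size - 1;
	ring->data = data;
}

static inline int demo_ring_is_empty(const demo_ring_t *ring)
{
	return ring->head == ring->tail;
}

static inline uint32_t demo_ring_len(const demo_ring_t *ring)
{
	return ring->tail - ring->head; /* wraps modulo 2^32 */
}

/* Returns 0 on success, -1 when all size slots are in use. */
static inline int demo_ring_enq(demo_ring_t *ring, uint32_t value)
{
	if (demo_ring_len(ring) == ring->mask + 1)
		return -1;

	ring->data[ring->tail & ring->mask] = value;
	ring->tail++;
	return 0;
}

/* Returns 0 on success, -1 when the ring is empty. */
static inline int demo_ring_deq(demo_ring_t *ring, uint32_t *value)
{
	if (demo_ring_is_empty(ring))
		return -1;

	*value = ring->data[ring->head & ring->mask];
	ring->head++;
	return 0;
}
```

Because the stored counters are not masked, a full ring has `tail - head == size` while an empty ring has `tail - head == 0`, so the two states remain distinguishable without sacrificing a slot. The `size - 1` limit applies to designs that store already-masked indexes.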
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page:

platform/linux-generic/odp_queue.c
line 95
@@ -143,8 +150,10 @@ static int queue_capability(odp_queue_capability_t *capa)
 	capa->max_sched_groups = sched_fn->num_grps();
 	capa->sched_prios = odp_schedule_num_prio();
 	capa->plain.max_num = capa->max_queues;
+	capa->plain.max_size = CONFIG_QUEUE_SIZE;
 	capa->plain.nonblocking = ODP_BLOCKING;
 	capa->sched.max_num = capa->max_queues;
+	capa->sched.max_size = CONFIG_QUEUE_SIZE;

Comment:
One entry is not lost.

> Petri Savolainen(psavol) wrote:
> One entry is not lost.

>> Petri Savolainen(psavol) wrote:
>> OK, added checks in v2.

>>> Petri Savolainen(psavol) wrote:
>>> OK. Compiler probably did that already, but changed in v2.

Petri Savolainen(psavol) wrote:
Tail and head indexes are (masked from) uint32_t and do not wrap around
when the ring is full. I think you assume that the store index is
0...size-1, while it's full uint32_t which is then masked to get the
actual index.

For example:
size = 100;

Empty:
head = 100
tail = 100
num = 100 - 100 = 0

Full:
head = 100
tail = 200
num = 200 - 100 = 100

Wrap uint32_t + full:
head = 0xFFFFFF9C
tail = 0
num = 0 - 0xFFFFFF9C = 0x64 = 100

So, no abs() needed. Ring size can be 4096, instead of 4095.

> Petri Savolainen(psavol) wrote:
> It's already documented 5 lines above:
>
> /* Initialize ring. Ring size must be a power of two. */
> static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t
> size)
> {

>> Petri Savolainen(psavol) wrote:
>> This function converts 32 bit buffer indexes to buffer header pointers.
>> The reverse operation is buffer_index_from_buf(). The prefetch is a side
>> effect of the function, which may be changed/moved any time if it's
>> found out that there's a place for prefetching. I actually plan to test
>> if number of prefetches should be limited as e.g. 32 consecutive
>> prefetches may be too much for some CPU architectures.

>>> Petri Savolainen(psavol) wrote:
>>> I prefer style where '== 0' is used instead of '!', especially when
>>> the if clause is as complex as this and there's a danger for the reader
>>> to miss the '!' sign.

Petri Savolainen(psavol) wrote:
It's there to ensure that all bits are zero even if someone modifies the
bitfield from two to three fields later on, similarly to how memset() to
zero is used for struct inits.

> Petri Savolainen(psavol) wrote:
> There's no need for abs(). Since it's all uint32_t variables,
> wraparound is handled already.
> An example in 8 bits:
> 0xff - 0xfd = 0x02
> 0x00 - 0xfe = 0x02
> 0x01 - 0xff = 0x02
> 0x02 - 0x00 = 0x02
>
> This passes both gcc and clang, and is already used in the other ring
> implementation, see ring_deq_multi().

>> Petri Savolainen(psavol) wrote:
>> I prefer style with a blank line at the end of a typedef, since it's
>> easier to spot the type name (as it's not mixed into struct field
>> names). Checkpatch passes so this style should be OK.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Does this mean that sizes larger than 32 have no added performance
>>> benefit?

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not
going to use the user-supplied queue size.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> Given its name, this looks like an extraneous statement that
> should be deleted. Renaming this to something like
> `prefetch_dequeued_bufs()` would make the intent clearer here.

>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural
>> here.

>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the
>>> effective queue capacity. The user-supplied `size` should then
>>> be set to `ROUNDUP_POWER2_U32(size) - 1` for the masking to
>>> work properly.

Bill Fischofer(Bill-Fischofer-Linaro) wrote:
Same comment here as for plain queues.

> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> As noted earlier, due to "losing" one entry to distinguish
> queue empty/full, this should be returned as
> `CONFIG_QUEUE_SIZE - 1`, and we also need to ensure that
> `CONFIG_QUEUE_SIZE` is itself a power of 2.
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/odp_queue.c line 92 @@ -143,8 +150,10 @@ static int queue_capability(odp_queue_capability_t *capa) capa->max_sched_groups = sched_fn->num_grps(); capa->sched_prios = odp_schedule_num_prio(); capa->plain.max_num = capa->max_queues; + capa->plain.max_size= CONFIG_QUEUE_SIZE; Comment: One entry is not lost. > Petri Savolainen(psavol) wrote: > OK, added checks in v2. >> Petri Savolainen(psavol) wrote: >> OK. Compiler probably did that already, but changed in v2. >>> Petri Savolainen(psavol) wrote: >>> Tail and head indexes are (masked from) uint32_t and do not wrap around >>> when the ring is full. I think you assume that the store index is >>> 0...size-1, while it's full uint32_t which is then masked to get the actual >>> index. >>> >>> For example: >>> size = 100; >>> >>> Empty: >>> head = 100 >>> tail = 100 >>> num = 100 - 100 = 0 >>> >>> Full: >>> head = 100 >>> tail = 200 >>> num = 200 - 100 = 100 >>> >>> Wrap uint32_t + full: >>> head = 0xFF9C >>> tail = 0 >>> num = 0 - 0xFF9C = 0x64 = 100 >>> >>> So, no abs() needed. Ring size can be 4096, instead of 4095. Petri Savolainen(psavol) wrote: It's already documented 5 lines above: /* Initialize ring. Ring size must be a power of two. */ static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) { > Petri Savolainen(psavol) wrote: > This function converts 32 bit buffer indexes to buffer header pointers. > The counter operation is buffer_index_from_buf(). The prefetch is a side > effect of the function, which may be changed/moved any time if it's found > out that there's a place for prefetching. I actually plan to test if > number of prefetches should be limited as e.g. 32 consecutive prefetches > may be too much for some CPU architectures. >> Petri Savolainen(psavol) wrote: >> I prefer style where '== 0' is used instead of '!'. 
Especially, when the >> if clause is as complex as this and there's danger for reader to miss >> the '!' sign. >>> Petri Savolainen(psavol) wrote: >>> It's there to ensure that all bits are zero also when someone would >>> modify the bitfield from two to three fields later on. Similarly to >>> memset() zero is used for struct inits. Petri Savolainen(psavol) wrote: There's no need for abs(). Since it's all uint32_t variables, wraparound is handled already. An example in 8 bits: 0xff - 0xfd = 0x02 0x00 - 0xfe = 0x02 0x01 - 0xff = 0x02 0x02 - 0x00 = 0x02 This passes both gcc and clang, and is used already in the other ring implementation; see ring_deq_multi(). > Petri Savolainen(psavol) wrote: > I prefer style with blank line in the end of a typedef, since it's > easier to spot the type name (as it's not mixed into struct field > names). Checkpatch passes so this style should be OK. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Does this mean that sizes larger than 32 have no added performance >> benefit? >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're >>> not going to use the user-supplied queue size. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Given its name, this looks like an extraneous statement that should be deleted. Renaming this to something like `prefetch_dequeued_bufs()` would make the intent clearer here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural > here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the >> effective queue capacity. The user-supplied `size` should then >> be set to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work >> properly. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Same comment here as for plain queues.
Bill Fischofer(Bill-Fischofer-Linaro) wrote: As noted earlier, due to "losing" one entry to distinguish queue empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Since you're initializing `index.pool` and `index.buffer` > there's no need to set `index.u32` here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> We originally had this
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/include/odp_buffer_internal.h line 17 @@ -41,11 +41,19 @@ typedef struct seg_entry_t { uint32_t len; } seg_entry_t; +typedef union buffer_index_t { + uint32_t u32; + + struct { + uint32_t pool :8; + uint32_t buffer :24; Comment: OK, added checks in v2. > Petri Savolainen(psavol) wrote: > OK. Compiler probably did that already, but changed in v2. >> Petri Savolainen(psavol) wrote: >> Tail and head indexes are (masked from) uint32_t and do not wrap around when >> the ring is full. I think you assume that the store index is 0...size-1, >> while it's full uint32_t which is then masked to get the actual index. >> >> For example: >> size = 100; >> >> Empty: >> head = 100 >> tail = 100 >> num = 100 - 100 = 0 >> >> Full: >> head = 100 >> tail = 200 >> num = 200 - 100 = 100 >> >> Wrap uint32_t + full: >> head = 0xFF9C >> tail = 0 >> num = 0 - 0xFF9C = 0x64 = 100 >> >> So, no abs() needed. Ring size can be 4096, instead of 4095. >>> Petri Savolainen(psavol) wrote: >>> It's already documented 5 lines above: >>> >>> /* Initialize ring. Ring size must be a power of two. */ >>> static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t >>> size) >>> { Petri Savolainen(psavol) wrote: This function converts 32 bit buffer indexes to buffer header pointers. The counter operation is buffer_index_from_buf(). The prefetch is a side effect of the function, which may be changed/moved any time if it's found out that there's a place for prefetching. I actually plan to test if number of prefetches should be limited as e.g. 32 consecutive prefetches may be too much for some CPU architectures. > Petri Savolainen(psavol) wrote: > I prefer style where '== 0' is used instead of '!'. Especially, when the > if clause is as complex as this and there's danger for reader to miss the > '!' sign. 
>> Petri Savolainen(psavol) wrote: >> It's there to ensure that all bits are zero also when someone would >> modify the bitfield from two to three fields later on. Similarly to >> memset() zero is used for struct inits. >>> Petri Savolainen(psavol) wrote: >>> There's no need for abs(). Since it's all uint32_t variables, >>> wraparound is handled already. >>> An example in 8 bits: >>> 0xff - 0xfd = 0x02 >>> 0x00 - 0xfe = 0x02 >>> 0x01 - 0xff = 0x02 >>> 0x02 - 0x00 = 0x02 >>> >>> This passes both gcc and clang, and is used already in the other ring >>> implementation; see ring_deq_multi(). Petri Savolainen(psavol) wrote: I prefer style with blank line in the end of a typedef, since it's easier to spot the type name (as it's not mixed into struct field names). Checkpatch passes so this style should be OK. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Does this mean that sizes larger than 32 have no added performance > benefit? >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're >> not going to use the user-supplied queue size. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Given its name, this looks like an extraneous statement that should >>> be deleted. Renaming this to something like >>> `prefetch_dequeued_bufs()` would make the intent clearer here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the > effective queue capacity. The user-supplied `size` should then be > set to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work > properly. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Same comment here as for plain queues.
>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> As noted earlier, due to "losing" one entry to distinguish >>> queue empty/full, this should be returned as `CONFIG_QUEUE_SIZE >>> - 1`, and we also need to ensure that `CONFIG_QUEUE_SIZE` is >>> itself a power of 2. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Since you're initializing `index.pool` and `index.buffer` there's no need to set `index.u32` here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > We originally had this index partitioning based on > `ODP_CONFIG_POOLS`. Do we want to return to that here? > > If not, we at least need an `ODP_STATIC_ASSERT()` to ensure
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; + + /* Empty */ + if (num == 0) + return 0; + + if (num > max_num) + num = max_num; + + idx = head & mask; + + for (i = 0; i < num; i++) { + data[i] = ring->data[idx]; + idx = (idx + 1) & mask; + } + + ring->head = head + num; + + return num; +} + +/* Enqueue data into the ring tail. Num_data is smaller than ring size. 
*/ +static inline uint32_t ring_st_enq_multi(ring_st_t *ring, const uint32_t data[], +uint32_t num_data) +{ + uint32_t head, tail, mask, size, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + size = mask + 1; + num = size - (tail - head); + + /* Full */ + if (num == 0) + return 0; + + if (num > num_data) + num = num_data; + + idx = tail & mask; + + for (i = 0; i < num; i++) { + ring->data[idx] = data[i]; + idx = (idx + 1) & mask; + } + + ring->tail = tail + num; + + return num; +} + +/* Check if ring is empty */ +static inline int ring_st_is_empty(ring_st_t *ring) +{ + uint32_t head, tail, num; + + head = ring->head; + tail = ring->tail; + num = tail - head; + + if (num == 0) + return 1; + + return 0; Comment: OK. Compiler probably did that already, but changed in v2. > Petri Savolainen(psavol) wrote: > Tail and head indexes are (masked from) uint32_t and do not wrap around when > the ring is full. I think you assume that the store index is 0...size-1, > while it's full uint32_t which is then masked to get the actual index. > > For example: > size = 100; > > Empty: > head = 100 > tail = 100 > num = 100 - 100 = 0 > > Full: > head = 100 > tail = 200 > num = 200 - 100 = 100 > > Wrap uint32_t + full: > head = 0xFF9C > tail = 0 > num = 0 - 0xFF9C = 0x64 = 100 > > So, no abs() needed. Ring size can be 4096, instead of 4095. >> Petri Savolainen(psavol) wrote: >> It's already documented 5 lines above: >> >> /* Initialize ring. Ring size must be a power of two. */ >> static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t >> size) >> { >>> Petri Savolainen(psavol) wrote: >>> This function converts 32 bit buffer indexes to buffer header pointers. The >>> counter operation is buffer_index_from_buf(). The prefetch is a side effect >>> of the function, which may be changed/moved any time if it's found out that >>> there's a place for prefetching. 
I actually plan to test if number of >>> prefetches should be limited as e.g. 32 consecutive prefetches may be too >>> much for some CPU architectures. Petri Savolainen(psavol) wrote: I prefer style where '== 0' is used instead of '!'. Especially, when the if clause is as complex as this and there's danger for reader to miss the '!' sign. > Petri Savolainen(psavol) wrote: > It's there to ensure that all bits are zero also when someone would > modify the bitfield from two to three fields later on. Similarly to > memset() zero is used for struct inits. >> Petri Savolainen(psavol) wrote: >> There's no need for abs(). Since it's all uint32_t variables, wrap a >> round is handled already. >> An example in 8bits: >> 0xff - 0xfd = 0x02 >> 0x00 - 0xfe = 0x02 >> 0x01 - 0xff = 0x02 >> 0x02 - 0x00 = 0x02 >> >> This passes both gcc and clang, and is used already in the other ring >> implementation see ring_deq_multi(). >>> Petri Savolainen(psavol) wrote: >>>
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c line 103 @@ -192,6 +201,9 @@ static odp_queue_t queue_create(const char *name, param = _param; } + if (param->size > CONFIG_QUEUE_SIZE) Comment: Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective queue capacity. The user-supplied `size` should then be set to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Same comment here as for plain queues. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> As noted earlier, due to "losing" one entry to distinguish queue empty/full, >> this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we also need to >> ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Since you're initializing `index.pool` and `index.buffer` there's no need >>> to set `index.u32` here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do we want to return to that here? If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that `ODP_CONFIG_POOLS < 256` or else bad things will happen here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > This routine can be optimized to: > ``` > return ring->head == ring->tail; > ``` >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Your invariant is the queue is empty when `head == tail` therefore the >> queue is full when `abs(tail - head) == mask`, so the correct >> calculation here is: >> >> `num = mask - abs(tail - head);` >> >> The effect is that a queue can only hold `size - 1` elements, otherwise >> you cannot distinguish between a full and an empty queue without another >> bit of metadata, which is a cost you're trying to avoid. 
>> >> This is somewhat problematic if the caller is trying to be "optimal" by >> specifying a power of two in the `size` parameter of the >> `odp_queue_param_t` passed to `odp_queue_create()`. For this reason we >> may wish to return a `max_size` of a power of 2 - 1 in >> `odp_queue_capability()` as part of this patch series. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This only works if `size` is a power of 2. Should be documented as >>> such, since this is an internal routine. In this case an >>> `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement >>> would be a useful debugging aid. Bill Fischofer(Bill-Fischofer-Linaro) wrote: should be `num = abs(tail - head);` to deal with wrap arounds, otherwise may be misinterpreted as overly large since it's `uint32_t`. Note that GCC and clang recognize `abs()` and treat it as a builtin, so there's no actual `stdlib.h` call here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169822191 updated_at 2018-02-22 09:32:40
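The disagreement in this thread is whether `tail - head` needs `abs()` when the counters wrap. The sketch below is a minimal illustration, not code from the patch: the helper names `ring_count()`, `ring_free()` and `ring_slot()` are hypothetical. It assumes that head and tail are free-running `uint32_t` counters that are masked only when the storage array is indexed, which is how `ring_st_deq_multi()` and `ring_st_enq_multi()` in the quoted header use them.

```c
#include <stdint.h>

/* Hypothetical helper: number of stored entries, given free-running
 * 32-bit head and tail counters (not pre-masked indexes). */
static inline uint32_t ring_count(uint32_t head, uint32_t tail)
{
	/* Unsigned subtraction is modulo 2^32, so the result stays
	 * correct after tail has wrapped past zero and head has not. */
	return tail - head;
}

/* Hypothetical helper: free slots in a ring that holds 'size' entries. */
static inline uint32_t ring_free(uint32_t head, uint32_t tail, uint32_t size)
{
	return size - ring_count(head, tail);
}

/* Storage slot for a counter value. 'size' must be a power of two. */
static inline uint32_t ring_slot(uint32_t counter, uint32_t size)
{
	return counter & (size - 1);
}
```

With this scheme an empty ring gives `ring_count() == 0` and a full ring gives `ring_count() == size`, so the two states stay distinguishable without giving up one entry. For 32-bit counters, the wrap case that yields 100 is `head = 0xFFFFFF9C`, `tail = 0`.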
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 78 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; + + /* Empty */ + if (num == 0) + return 0; + + if (num > max_num) + num = max_num; + + idx = head & mask; + + for (i = 0; i < num; i++) { + data[i] = ring->data[idx]; + idx = (idx + 1) & mask; + } + + ring->head = head + num; + + return num; +} + +/* Enqueue data into the ring tail. Num_data is smaller than ring size. */ +static inline uint32_t ring_st_enq_multi(ring_st_t *ring, const uint32_t data[], +uint32_t num_data) +{ + uint32_t head, tail, mask, size, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + size = mask + 1; + num = size - (tail - head); Comment: Tail and head indexes are (masked from) uint32_t and do not wrap around when the ring is full. 
I think you assume that the store index is 0...size-1, while it's full uint32_t which is then masked to get the actual index. For example: size = 100; Empty: head = 100 tail = 100 num = 100 - 100 = 0 Full: head = 100 tail = 200 num = 200 - 100 = 100 Wrap uint32_t + full: head = 0xFF9C tail = 0 num = 0 - 0xFF9C = 0x64 = 100 So, no abs() needed. Ring size can be 4096, instead of 4095. > Petri Savolainen(psavol) wrote: > It's already documented 5 lines above: > > /* Initialize ring. Ring size must be a power of two. */ > static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t > size) > { >> Petri Savolainen(psavol) wrote: >> This function converts 32 bit buffer indexes to buffer header pointers. The >> counter operation is buffer_index_from_buf(). The prefetch is a side effect >> of the function, which may be changed/moved any time if it's found out that >> there's a place for prefetching. I actually plan to test if number of >> prefetches should be limited as e.g. 32 consecutive prefetches may be too >> much for some CPU architectures. >>> Petri Savolainen(psavol) wrote: >>> I prefer style where '== 0' is used instead of '!'. Especially, when the if >>> clause is as complex as this and there's danger for reader to miss the '!' >>> sign. Petri Savolainen(psavol) wrote: It's there to ensure that all bits are zero also when someone would modify the bitfield from two to three fields later on. Similarly to memset() zero is used for struct inits. > Petri Savolainen(psavol) wrote: > There's no need for abs(). Since it's all uint32_t variables, wrap a > round is handled already. > An example in 8bits: > 0xff - 0xfd = 0x02 > 0x00 - 0xfe = 0x02 > 0x01 - 0xff = 0x02 > 0x02 - 0x00 = 0x02 > > This passes both gcc and clang, and is used already in the other ring > implementation see ring_deq_multi(). 
>> Petri Savolainen(psavol) wrote: >> I prefer style with blank line in the end of a typedef, since it's >> easier to spot the type name (as it's not mixed into struct field >> names). Checkpatch passes so this style should be OK. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Does this mean that sizes larger than 32 have no added performance >>> benefit? Bill Fischofer(Bill-Fischofer-Linaro) wrote: Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not going to use the user-supplied queue size. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Given its name, this looks like an extraneous statement that should > be deleted. Renaming this to something like > `prefetch_dequeued_bufs()` would make the intent clearer
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 32 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; Comment: It's already documented 5 lines above: /* Initialize ring. Ring size must be a power of two. */ static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) { > Petri Savolainen(psavol) wrote: > This function converts 32 bit buffer indexes to buffer header pointers. The > counter operation is buffer_index_from_buf(). The prefetch is a side effect > of the function, which may be changed/moved any time if it's found out that > there's a place for prefetching. I actually plan to test if number of > prefetches should be limited as e.g. 32 consecutive prefetches may be too > much for some CPU architectures. >> Petri Savolainen(psavol) wrote: >> I prefer style where '== 0' is used instead of '!'. Especially, when the if >> clause is as complex as this and there's danger for reader to miss the '!' >> sign. >>> Petri Savolainen(psavol) wrote: >>> It's there to ensure that all bits are zero also when someone would modify >>> the bitfield from two to three fields later on. Similarly to memset() zero >>> is used for struct inits. Petri Savolainen(psavol) wrote: There's no need for abs(). 
Since it's all uint32_t variables, wraparound is handled already. An example in 8 bits: 0xff - 0xfd = 0x02 0x00 - 0xfe = 0x02 0x01 - 0xff = 0x02 0x02 - 0x00 = 0x02 This passes both gcc and clang, and is used already in the other ring implementation; see ring_deq_multi(). > Petri Savolainen(psavol) wrote: > I prefer style with blank line in the end of a typedef, since it's easier > to spot the type name (as it's not mixed into struct field names). > Checkpatch passes so this style should be OK. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Does this mean that sizes larger than 32 have no added performance >> benefit? >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not >>> going to use the user-supplied queue size. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Given its name, this looks like an extraneous statement that should be deleted. Renaming this to something like `prefetch_dequeued_bufs()` would make the intent clearer here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the >> effective queue capacity. The user-supplied `size` should then be >> set to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work >> properly. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Same comment here as for plain queues. Bill Fischofer(Bill-Fischofer-Linaro) wrote: As noted earlier, due to "losing" one entry to distinguish queue empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Since you're initializing `index.pool` and `index.buffer` there's > no need to set `index.u32` here.
>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> We originally had this index partitioning based on >> `ODP_CONFIG_POOLS`. Do we want to return to that here? >> >> If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that >> `ODP_CONFIG_POOLS < 256` or else bad things will happen here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This routine can be optimized to: >>> ``` >>> return ring->head == ring->tail; >>> ``` Bill Fischofer(Bill-Fischofer-Linaro) wrote: Your invariant is the queue is empty when `head == tail` therefore
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/odp_queue.c @@ -263,7 +275,7 @@ static int queue_destroy(odp_queue_t handle) ODP_ERR("queue \"%s\" already destroyed\n", queue->s.name); return -1; } - if (queue->s.head != NULL) { + if (ring_st_is_empty(&queue->s.ring_st) == 0) { Comment: I prefer style where '== 0' is used instead of '!'. Especially, when the if clause is as complex as this and there's danger for reader to miss the '!' sign. > Petri Savolainen(psavol) wrote: > It's there to ensure that all bits are zero also when someone would modify > the bitfield from two to three fields later on. Similarly to memset() zero is > used for struct inits. >> Petri Savolainen(psavol) wrote: >> There's no need for abs(). Since it's all uint32_t variables, wraparound >> is handled already. >> An example in 8 bits: >> 0xff - 0xfd = 0x02 >> 0x00 - 0xfe = 0x02 >> 0x01 - 0xff = 0x02 >> 0x02 - 0x00 = 0x02 >> >> This passes both gcc and clang, and is used already in the other ring >> implementation; see ring_deq_multi(). >>> Petri Savolainen(psavol) wrote: >>> I prefer style with blank line in the end of a typedef, since it's easier >>> to spot the type name (as it's not mixed into struct field names). >>> Checkpatch passes so this style should be OK. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Does this mean that sizes larger than 32 have no added performance benefit? > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not > going to use the user-supplied queue size. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Given its name, this looks like an extraneous statement that should be >> deleted. Renaming this to something like `prefetch_dequeued_bufs()` >> would make the intent clearer here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here.
Bill Fischofer(Bill-Fischofer-Linaro) wrote: Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective queue capacity. The user-supplied `size` should then be set to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Same comment here as for plain queues. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> As noted earlier, due to "losing" one entry to distinguish queue >> empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and >> we also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of >> 2. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Since you're initializing `index.pool` and `index.buffer` there's >>> no need to set `index.u32` here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do we want to return to that here? If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that `ODP_CONFIG_POOLS < 256` or else bad things will happen here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > This routine can be optimized to: > ``` > return ring->head == ring->tail; > ``` >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Your invariant is the queue is empty when `head == tail` >> therefore the queue is full when `abs(tail - head) == mask`, so >> the correct calculation here is: >> >> `num = mask - abs(tail - head);` >> >> The effect is that a queue can only hold `size - 1` elements, >> otherwise you cannot distinguish between a full and an empty >> queue without another bit of metadata, which is a cost you're >> trying to avoid. >> >> This is somewhat problematic if the caller is trying to be >> "optimal" by specifying a power of two in the `size` parameter >> of the `odp_queue_param_t` passed to `odp_queue_create()`. For >> this reason we may wish to return a `max_size` of a power of 2 - >> 1 in `odp_queue_capability()` as part of this patch series. 
>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This only works if `size` is a power of 2. Should be documented >>> as such, since this is an internal routine. In this case an >>> `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this >>> requirement would be a useful debugging aid. Bill Fischofer(Bill-Fischofer-Linaro) wrote: should be `num = abs(tail - head);` to deal with wrap arounds,
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 24 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + Comment: I prefer style with blank line in the end of a typedef, since it's easier to spot the type name (as it's not mixed into struct field names). Checkpatch passes so this style should be OK. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Does this mean that sizes larger than 32 have no added performance benefit? >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not going >> to use the user-supplied queue size. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Given its name, this looks like an extraneous statement that should be >>> deleted. Renaming this to something like `prefetch_dequeued_bufs()` would >>> make the intent clearer here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective > queue capacity. The user-supplied `size` should then be set to > `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Same comment here as for plain queues.
>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> As noted earlier, due to "losing" one entry to distinguish queue >>> empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we >>> also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Since you're initializing `index.pool` and `index.buffer` there's no need to set `index.u32` here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > We originally had this index partitioning based on > `ODP_CONFIG_POOLS`. Do we want to return to that here? > > If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that > `ODP_CONFIG_POOLS < 256` or else bad things will happen here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This routine can be optimized to: >> ``` >> return ring->head == ring->tail; >> ``` >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Your invariant is the queue is empty when `head == tail` therefore >>> the queue is full when `abs(tail - head) == mask`, so the correct >>> calculation here is: >>> >>> `num = mask - abs(tail - head);` >>> >>> The effect is that a queue can only hold `size - 1` elements, >>> otherwise you cannot distinguish between a full and an empty queue >>> without another bit of metadata, which is a cost you're trying to >>> avoid. >>> >>> This is somewhat problematic if the caller is trying to be >>> "optimal" by specifying a power of two in the `size` parameter of >>> the `odp_queue_param_t` passed to `odp_queue_create()`. For this >>> reason we may wish to return a `max_size` of a power of 2 - 1 in >>> `odp_queue_capability()` as part of this patch series. Bill Fischofer(Bill-Fischofer-Linaro) wrote: This only works if `size` is a power of 2. Should be documented as such, since this is an internal routine. In this case an `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement would be a useful debugging aid. 
> Bill Fischofer(Bill-Fischofer-Linaro) wrote: > should be `num = abs(tail - head);` to deal with wrap arounds, > otherwise may be misinterpreted as overly large since it's > `uint32_t`. Note that GCC and clang recognize `abs()` and treat > it as a builtin, so there's no actual `stdlib.h` call here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169871326 updated_at 2018-02-22 09:32:40
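Two of the checks requested in this thread can be sketched in plain C11: a compile-time guard that the pool count fits the 8-bit `pool` bitfield of `buffer_index_t`, and a power-of-two test for the ring size. This is a sketch under stated assumptions, not ODP code. `EXAMPLE_POOL_BITS`, `EXAMPLE_CONFIG_POOLS` and `example_is_power_of_two()` are hypothetical names, and `_Static_assert` stands in for `ODP_STATIC_ASSERT()`.

```c
#include <stdint.h>

/* Width of the 'pool' bitfield in the quoted buffer_index_t diff. */
#define EXAMPLE_POOL_BITS    8
/* Hypothetical value; stands in for ODP_CONFIG_POOLS. */
#define EXAMPLE_CONFIG_POOLS 64

/* Compile-time check that every pool index (0 .. pools - 1) fits in
 * the bitfield. The review comment asks for the slightly stricter
 * condition 'pools < 256'. */
_Static_assert(EXAMPLE_CONFIG_POOLS <= (1u << EXAMPLE_POOL_BITS),
	       "pool index does not fit in the pool bitfield");

/* Returns 1 when size is a non-zero power of two, else 0. A power of
 * two has a single bit set, so clearing the lowest set bit gives 0. */
static inline int example_is_power_of_two(uint32_t size)
{
	return size != 0 && (size & (size - 1)) == 0;
}
```

A debug assertion on `example_is_power_of_two(size)` at ring initialization would catch a size such as 100, for which the `size - 1` mask does not cover a contiguous index range.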
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Petri Savolainen(psavol) replied on github web page: platform/linux-generic/odp_pool.c line 28 @@ -296,7 +282,9 @@ static void init_buffers(pool_t *pool) memset(buf_hdr, 0, (uintptr_t)data - (uintptr_t)buf_hdr); /* Initialize buffer metadata */ - buf_hdr->index = i; + buf_hdr->index.u32= 0; + buf_hdr->index.pool = pool->pool_idx; + buf_hdr->index.buffer = i; Comment: It's there to ensure that all bits are zero also when someone would modify the bitfield from two to three fields later on. Similarly to memset() zero is used for struct inits. > Petri Savolainen(psavol) wrote: > There's no need for abs(). Since it's all uint32_t variables, wraparound is > handled already. > An example in 8 bits: > 0xff - 0xfd = 0x02 > 0x00 - 0xfe = 0x02 > 0x01 - 0xff = 0x02 > 0x02 - 0x00 = 0x02 > > This passes both gcc and clang, and is used already in the other ring > implementation; see ring_deq_multi(). >> Petri Savolainen(psavol) wrote: >> I prefer style with blank line in the end of a typedef, since it's easier to >> spot the type name (as it's not mixed into struct field names). Checkpatch >> passes so this style should be OK. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Does this mean that sizes larger than 32 have no added performance benefit? Bill Fischofer(Bill-Fischofer-Linaro) wrote: Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not going to use the user-supplied queue size. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Given its name, this looks like an extraneous statement that should be > deleted. Renaming this to something like `prefetch_dequeued_bufs()` would > make the intent clearer here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the >>> effective queue capacity.
The user-supplied `size` should then be set >>> to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Same comment here as for plain queues. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > As noted earlier, due to "losing" one entry to distinguish queue > empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and > we also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of > 2. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Since you're initializing `index.pool` and `index.buffer` there's no >> need to set `index.u32` here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> We originally had this index partitioning based on >>> `ODP_CONFIG_POOLS`. Do we want to return to that here? >>> >>> If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that >>> `ODP_CONFIG_POOLS < 256` or else bad things will happen here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: This routine can be optimized to: ``` return ring->head == ring->tail; ``` > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Your invariant is the queue is empty when `head == tail` > therefore the queue is full when `abs(tail - head) == mask`, so > the correct calculation here is: > > `num = mask - abs(tail - head);` > > The effect is that a queue can only hold `size - 1` elements, > otherwise you cannot distinguish between a full and an empty > queue without another bit of metadata, which is a cost you're > trying to avoid. > > This is somewhat problematic if the caller is trying to be > "optimal" by specifying a power of two in the `size` parameter of > the `odp_queue_param_t` passed to `odp_queue_create()`. For this > reason we may wish to return a `max_size` of a power of 2 - 1 in > `odp_queue_capability()` as part of this patch series. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This only works if `size` is a power of 2. Should be documented >> as such, since this is an internal routine. 
In this case an >> `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this >> requirement would be a useful debugging aid. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> should be `num = abs(tail - head);` to deal with wrap arounds, >>> otherwise may be misinterpreted as overly large since it's >>> `uint32_t`. Note that GCC and clang recognize `abs()` and treat >>> it as
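The wraparound argument in the 8-bit example quoted above can be checked directly. This is only a minimal sketch, assuming free-running unsigned indices; the helper names `ring_used_u8()` and `ring_used_u32()` are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Used entries with 8-bit indices, as in the quoted example. The operands
 * are promoted to int, so the cast back to uint8_t is what gives the
 * modulo-256 difference. */
static inline uint8_t ring_used_u8(uint8_t head, uint8_t tail)
{
	return (uint8_t)(tail - head);
}

/* Same calculation with 32-bit indices. Unsigned arithmetic wraps
 * modulo 2^32, so the result is correct across wraparound without abs(). */
static inline uint32_t ring_used_u32(uint32_t head, uint32_t tail)
{
	return (uint32_t)(tail - head);
}
```

Calling `ring_used_u8(0xfe, 0x00)` corresponds to the second row of the example (`0x00 - 0xfe`).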
Re: [lng-odp] [PATCH API-NEXT v2] IPsec TFC implementation
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_ipsec.c line 245 @@ -1165,6 +1167,8 @@ static int ipsec_out_esp(odp_packet_t *pkt, ipsec_offset + _ODP_ESPHDR_LEN, ipsec_sa->esp_iv_len, state->iv + ipsec_sa->salt_length); + _odp_packet_set_data(*pkt, esptrl_offset - esptrl.pad_len - tfc_len, +0xa5, tfc_len); Comment: Should `0xa5` be a `#define` rather than a "magic number"? > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Nit: could use `odp_unlikely()` here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This change requires an API change as the spec says relevant offsets must be >> in the range `0..odp_packet_len(pkt) - 1` . Same comment for the L3 and L4 >> changes in this patch. >> >> In theory the validation tests should test these bounds, but as with most >> parts of the API violations simply result in undefined behavior, so this is >> an "honor system". Still, we can't violate the spec here without changing >> the spec. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Unless it has been parsed, `odp_packet_l3_offset()` is initialized to >>> `ODP_PACKET_OFFSET_INVALID`, so this seems an undue burden. The original >>> wording seems cleaner from an application perspective. https://github.com/Linaro/odp/pull/494#discussion_r169839450 updated_at 2018-02-22 02:28:33
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_config_internal.h line 5 @@ -144,7 +144,7 @@ extern "C" { * This controls the burst size on various enqueue, dequeue, etc calls. Large * burst size improves throughput, but may degrade QoS (increase latency). */ -#define CONFIG_BURST_SIZE 16 +#define CONFIG_BURST_SIZE 32 Comment: Does this mean that sizes larger than 32 have no added performance benefit? > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not going > to use the user-supplied queue size. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Given its name, this looks like an extraneous statement that should be >> deleted. Renaming this to something like `prefetch_dequeued_bufs()` would >> make the intent clearer here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective queue capacity. The user-supplied `size` should then be set to `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Same comment here as for plain queues. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> As noted earlier, due to "losing" one entry to distinguish queue >> empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we >> also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Since you're initializing `index.pool` and `index.buffer` there's no >>> need to set `index.u32` here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do we want to return to that here? 
If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that `ODP_CONFIG_POOLS < 256` or else bad things will happen here. 
> Bill Fischofer(Bill-Fischofer-Linaro) wrote: > This routine can be optimized to: > ``` > return ring->head == ring->tail; > ``` >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Your invariant is the queue is empty when `head == tail` therefore >> the queue is full when `abs(tail - head) == mask`, so the correct >> calculation here is: >> >> `num = mask - abs(tail - head);` >> >> The effect is that a queue can only hold `size - 1` elements, >> otherwise you cannot distinguish between a full and an empty queue >> without another bit of metadata, which is a cost you're trying to >> avoid. >> >> This is somewhat problematic if the caller is trying to be "optimal" >> by specifying a power of two in the `size` parameter of the >> `odp_queue_param_t` passed to `odp_queue_create()`. For this reason >> we may wish to return a `max_size` of a power of 2 - 1 in >> `odp_queue_capability()` as part of this patch series. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This only works if `size` is a power of 2. Should be documented as >>> such, since this is an internal routine. In this case an >>> `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement >>> would be a useful debugging aid. Bill Fischofer(Bill-Fischofer-Linaro) wrote: should be `num = abs(tail - head);` to deal with wrap arounds, otherwise may be misinterpreted as overly large since it's `uint32_t`. Note that GCC and clang recognize `abs()` and treat it as a builtin, so there's no actual `stdlib.h` call here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169825596 updated_at 2018-02-22 09:32:40
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c line 420 @@ -584,8 +556,9 @@ static int queue_init(queue_entry_t *queue, const char *name, queue->s.pktin = PKTIN_INVALID; queue->s.pktout = PKTOUT_INVALID; - queue->s.head = NULL; - queue->s.tail = NULL; + ring_st_init(&queue->s.ring_st, +queue_tbl->ring_data[queue->s.index].data, +CONFIG_QUEUE_SIZE); Comment: Must use `CONFIG_QUEUE_SIZE - 1` here, as noted earlier, if we're not going to use the user-supplied queue size. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Given its name, this looks like an extraneous statement that should be > deleted. Renaming this to something like `prefetch_dequeued_bufs()` would > make the intent clearer here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective >>> queue capacity. The user-supplied `size` should then be set to >>> `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Same comment here as for plain queues. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > As noted earlier, due to "losing" one entry to distinguish queue > empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we > also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Since you're initializing `index.pool` and `index.buffer` there's no >> need to set `index.u32` here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> We originally had this index partitioning based on `ODP_CONFIG_POOLS`. >>> Do we want to return to that here? >>> >>> If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that >>> `ODP_CONFIG_POOLS < 256` or else bad things will happen here. 
Bill Fischofer(Bill-Fischofer-Linaro) wrote: This routine can be optimized to: ``` return ring->head == ring->tail; ``` > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Your invariant is the queue is empty when `head == tail` therefore > the queue is full when `abs(tail - head) == mask`, so the correct > calculation here is: > > `num = mask - abs(tail - head);` > > The effect is that a queue can only hold `size - 1` elements, > otherwise you cannot distinguish between a full and an empty queue > without another bit of metadata, which is a cost you're trying to > avoid. > > This is somewhat problematic if the caller is trying to be "optimal" > by specifying a power of two in the `size` parameter of the > `odp_queue_param_t` passed to `odp_queue_create()`. For this reason > we may wish to return a `max_size` of a power of 2 - 1 in > `odp_queue_capability()` as part of this patch series. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This only works if `size` is a power of 2. Should be documented as >> such, since this is an internal routine. In this case an >> `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement >> would be a useful debugging aid. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> should be `num = abs(tail - head);` to deal with wrap arounds, >>> otherwise may be misinterpreted as overly large since it's >>> `uint32_t`. Note that GCC and clang recognize `abs()` and treat it >>> as a builtin, so there's no actual `stdlib.h` call here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169825091 updated_at 2018-02-22 09:32:40
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c @@ -471,51 +476,18 @@ static inline int deq_multi(queue_entry_t *queue, odp_buffer_hdr_t *buf_hdr[], } UNLOCK(&queue->s.lock); - return 0; - } - - for (i = 0; i < num && hdr; ) { - int burst_num = hdr->burst_num; - int first = hdr->burst_first; - /* First, get bursted buffers */ - for (j = 0; j < burst_num && i < num; j++, i++) { - buf_hdr[i] = hdr->burst[first + j]; - odp_prefetch(buf_hdr[i]); - } - - if (burst_num) { - hdr->burst_num = burst_num - j; - hdr->burst_first = first + j; - } - - if (i == num) - break; - - /* When burst is empty, consume the current buffer header and -* move to the next header */ - buf_hdr[i] = hdr; - next = hdr->next; - hdr->next = NULL; - hdr= next; - updated++; - i++; + return 0; } - /* Write head only if updated */ - if (updated) - queue->s.head = hdr; - - /* Queue is empty */ - if (hdr == NULL) - queue->s.tail = NULL; - if (status_sync && queue->s.type == ODP_QUEUE_TYPE_SCHED) sched_fn->save_context(queue->s.index); UNLOCK(&queue->s.lock); - return i; + buffer_index_to_buf(buf_hdr, buf_idx, num_deq); Comment: Given its name, this looks like an extraneous statement that should be deleted. Renaming this to something like `prefetch_dequeued_bufs()` would make the intent clearer here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective >> queue capacity. The user-supplied `size` should then be set to >> `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Same comment here as for plain queues. 
Bill Fischofer(Bill-Fischofer-Linaro) wrote: As noted earlier, due to "losing" one entry to distinguish queue empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Since you're initializing `index.pool` and `index.buffer` there's no need > to set `index.u32` here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> We originally had this index partitioning based on `ODP_CONFIG_POOLS`. >> Do we want to return to that here? >> >> If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that >> `ODP_CONFIG_POOLS < 256` or else bad things will happen here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This routine can be optimized to: >>> ``` >>> return ring->head == ring->tail; >>> ``` Bill Fischofer(Bill-Fischofer-Linaro) wrote: Your invariant is the queue is empty when `head == tail` therefore the queue is full when `abs(tail - head) == mask`, so the correct calculation here is: `num = mask - abs(tail - head);` The effect is that a queue can only hold `size - 1` elements, otherwise you cannot distinguish between a full and an empty queue without another bit of metadata, which is a cost you're trying to avoid. This is somewhat problematic if the caller is trying to be "optimal" by specifying a power of two in the `size` parameter of the `odp_queue_param_t` passed to `odp_queue_create()`. For this reason we may wish to return a `max_size` of a power of 2 - 1 in `odp_queue_capability()` as part of this patch series. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > This only works if `size` is a power of 2. Should be documented as > such, since this is an internal routine. In this case an > `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement > would be a useful debugging aid. 
>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> should be `num = abs(tail - head);` to deal with wrap arounds, >> otherwise may be misinterpreted as overly large since it's >> `uint32_t`. Note that GCC and clang recognize `abs()` and treat it >> as a builtin, so there's no actual `stdlib.h` call here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Extra blank line should be removed (nit).
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c @@ -263,7 +275,7 @@ static int queue_destroy(odp_queue_t handle) ODP_ERR("queue \"%s\" already destroyed\n", queue->s.name); return -1; } - if (queue->s.head != NULL) { + if (ring_st_is_empty(&queue->s.ring_st) == 0) { Comment: `if (!ring_st_is_empty(&queue->s.ring_st))` seems more natural here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Change to `if (param->size >= CONFIG_QUEUE_SIZE)` to handle the effective > queue capacity. The user-supplied `size` should then be set to > `ROUNDUP_POWER2_U32(size) - 1` for the masking to work properly. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Same comment here as for plain queues. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> As noted earlier, due to "losing" one entry to distinguish queue >>> empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we also >>> need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Since you're initializing `index.pool` and `index.buffer` there's no need to set `index.u32` here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do > we want to return to that here? > > If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that > `ODP_CONFIG_POOLS < 256` or else bad things will happen here. 
>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This routine can be optimized to: >> ``` >> return ring->head == ring->tail; >> ``` >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Your invariant is the queue is empty when `head == tail` therefore the >>> queue is full when `abs(tail - head) == mask`, so the correct >>> calculation here is: >>> >>> `num = mask - abs(tail - head);` >>> >>> The effect is that a queue can only hold `size - 1` elements, otherwise >>> you cannot distinguish between a full and an empty queue without >>> another bit of metadata, which is a cost you're trying to avoid. >>> >>> This is somewhat problematic if the caller is trying to be "optimal" by >>> specifying a power of two in the `size` parameter of the >>> `odp_queue_param_t` passed to `odp_queue_create()`. For this reason we >>> may wish to return a `max_size` of a power of 2 - 1 in >>> `odp_queue_capability()` as part of this patch series. Bill Fischofer(Bill-Fischofer-Linaro) wrote: This only works if `size` is a power of 2. Should be documented as such, since this is an internal routine. In this case an `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement would be a useful debugging aid. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > should be `num = abs(tail - head);` to deal with wrap arounds, > otherwise may be misinterpreted as overly large since it's > `uint32_t`. Note that GCC and clang recognize `abs()` and treat it as > a builtin, so there's no actual `stdlib.h` call here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169823060 updated_at 2018-02-22 09:32:40
Re: [lng-odp] [PATCH API-NEXT v2] IPsec TFC implementation
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_packet.c line 5 @@ -1286,7 +1286,7 @@ int odp_packet_l2_offset_set(odp_packet_t pkt, uint32_t offset) { odp_packet_hdr_t *pkt_hdr = packet_hdr(pkt); - if (offset >= pkt_hdr->frame_len) + if (offset > pkt_hdr->frame_len) Comment: This change requires an API change as the spec says relevant offsets must be in the range `0..odp_packet_len(pkt) - 1` . Same comment for the L3 and L4 changes in this patch. In theory the validation tests should test these bounds, but as with most parts of the API violations simply result in undefined behavior, so this is an "honor system". Still, we can't violate the spec here without changing the spec. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Unless it has been parsed, `odp_packet_l3_offset()` is initialized to > `ODP_PACKET_OFFSET_INVALID`, so this seems an undue burden. The original > wording seems cleaner from an application perspective. https://github.com/Linaro/odp/pull/494#discussion_r169837497 updated_at 2018-02-22 02:28:33
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; + + /* Empty */ + if (num == 0) + return 0; + + if (num > max_num) + num = max_num; + + idx = head & mask; + + for (i = 0; i < num; i++) { + data[i] = ring->data[idx]; + idx = (idx + 1) & mask; + } + + ring->head = head + num; + + return num; +} + +/* Enqueue data into the ring tail. Num_data is smaller than ring size. 
*/ +static inline uint32_t ring_st_enq_multi(ring_st_t *ring, const uint32_t data[], +uint32_t num_data) +{ + uint32_t head, tail, mask, size, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + size = mask + 1; + num = size - (tail - head); + + /* Full */ + if (num == 0) + return 0; + + if (num > num_data) + num = num_data; + + idx = tail & mask; + + for (i = 0; i < num; i++) { + ring->data[idx] = data[i]; + idx = (idx + 1) & mask; + } + + ring->tail = tail + num; + + return num; +} + +/* Check if ring is empty */ +static inline int ring_st_is_empty(ring_st_t *ring) +{ + uint32_t head, tail, num; + + head = ring->head; + tail = ring->tail; + num = tail - head; + + if (num == 0) + return 1; + + return 0; Comment: This routine can be optimized to: ``` return ring->head == ring->tail; ``` > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Your invariant is the queue is empty when `head == tail` therefore the queue > is full when `abs(tail - head) == mask`, so the correct calculation here is: > > `num = mask - abs(tail - head);` > > The effect is that a queue can only hold `size - 1` elements, otherwise you > cannot distinguish between a full and an empty queue without another bit of > metadata, which is a cost you're trying to avoid. > > This is somewhat problematic if the caller is trying to be "optimal" by > specifying a power of two in the `size` parameter of the `odp_queue_param_t` > passed to `odp_queue_create()`. For this reason we may wish to return a > `max_size` of a power of 2 - 1 in `odp_queue_capability()` as part of this > patch series. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This only works if `size` is a power of 2. Should be documented as such, >> since this is an internal routine. In this case an `ODP_ASSERT(size == >> ROUNDUP_POWER2_U32(size))` for this requirement would be a useful debugging >> aid. 
>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> should be `num = abs(tail - head);` to deal with wrap arounds, otherwise >>> may be misinterpreted as overly large since it's `uint32_t`. Note that GCC >>> and clang recognize `abs()` and treat it as a builtin, so there's no actual >>> `stdlib.h` call here. Bill Fischofer(Bill-Fischofer-Linaro) wrote: Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169804664 updated_at 2018-02-22 09:32:40
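The simplification suggested for `ring_st_is_empty()` can be sketched as follows. The struct is a stand-in copied from the hunk quoted in the message above; the `const` qualifier on the parameter is an addition for illustration and is not in the posted patch.

```c
#include <stdint.h>

/* Stand-in for the ring_st_t posted in the patch */
typedef struct {
	uint32_t head;
	uint32_t tail;
	uint32_t mask;
	uint32_t *data;
} ring_st_t;

/* The ring is empty exactly when both indices are equal, so the
 * comparison result can be returned directly (1 = empty, 0 = not empty). */
static inline int ring_st_is_empty(const ring_st_t *ring)
{
	return ring->head == ring->tail;
}
```

Since `tail - head == 0` holds only when `head == tail`, this returns the same values as the version in the patch.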
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_queue.c line 92 @@ -143,8 +150,10 @@ static int queue_capability(odp_queue_capability_t *capa) capa->max_sched_groups = sched_fn->num_grps(); capa->sched_prios = odp_schedule_num_prio(); capa->plain.max_num = capa->max_queues; + capa->plain.max_size= CONFIG_QUEUE_SIZE; Comment: As noted earlier, due to "losing" one entry to distinguish queue empty/full, this should be returned as `CONFIG_QUEUE_SIZE - 1`, and we also need to ensure that `CONFIG_QUEUE_SIZE` is itself a power of 2. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Since you're initializing `index.pool` and `index.buffer` there's no need to > set `index.u32` here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do we >> want to return to that here? >> >> If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that >> `ODP_CONFIG_POOLS < 256` or else bad things will happen here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This routine can be optimized to: >>> ``` >>> return ring->head == ring->tail; >>> ``` Bill Fischofer(Bill-Fischofer-Linaro) wrote: Your invariant is the queue is empty when `head == tail` therefore the queue is full when `abs(tail - head) == mask`, so the correct calculation here is: `num = mask - abs(tail - head);` The effect is that a queue can only hold `size - 1` elements, otherwise you cannot distinguish between a full and an empty queue without another bit of metadata, which is a cost you're trying to avoid. This is somewhat problematic if the caller is trying to be "optimal" by specifying a power of two in the `size` parameter of the `odp_queue_param_t` passed to `odp_queue_create()`. For this reason we may wish to return a `max_size` of a power of 2 - 1 in `odp_queue_capability()` as part of this patch series. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > This only works if `size` is a power of 2. 
Should be documented as such, > since this is an internal routine. In this case an `ODP_ASSERT(size == > ROUNDUP_POWER2_U32(size))` for this requirement would be a useful > debugging aid. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> should be `num = abs(tail - head);` to deal with wrap arounds, otherwise >> may be misinterpreted as overly large since it's `uint32_t`. Note that >> GCC and clang recognize `abs()` and treat it as a builtin, so there's no >> actual `stdlib.h` call here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169821543 updated_at 2018-02-22 09:32:40
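The power-of-two requirement mentioned above could be checked along these lines. This is a sketch only: `is_power_of_two_u32()` and `ring_mask_from_size()` are hypothetical names, and plain `assert()` stands in for `ODP_ASSERT()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when size is a power of two, else 0.
 * A power of two has exactly one bit set, so clearing the lowest set
 * bit with size & (size - 1) must leave zero. */
static inline int is_power_of_two_u32(uint32_t size)
{
	return size != 0 && (size & (size - 1)) == 0;
}

/* Sketch of the mask setup with the suggested debug check added.
 * Masking with size - 1 only works as a modulo for power-of-two sizes. */
static inline uint32_t ring_mask_from_size(uint32_t size)
{
	assert(is_power_of_two_u32(size));
	return size - 1;
}
```

The bit test is equivalent to `size == ROUNDUP_POWER2_U32(size)` for non-zero sizes, assuming that macro rounds up to the next power of two as its name suggests.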
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_buffer_internal.h line 17 @@ -41,11 +41,19 @@ typedef struct seg_entry_t { uint32_t len; } seg_entry_t; +typedef union buffer_index_t { + uint32_t u32; + + struct { + uint32_t pool :8; + uint32_t buffer :24; Comment: We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do we want to return to that here? If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that `ODP_CONFIG_POOLS < 256` or else bad things will happen here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > This routine can be optimized to: > ``` > return ring->head == ring->tail; > ``` >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Your invariant is the queue is empty when `head == tail` therefore the queue >> is full when `abs(tail - head) == mask`, so the correct calculation here is: >> >> `num = mask - abs(tail - head);` >> >> The effect is that a queue can only hold `size - 1` elements, otherwise you >> cannot distinguish between a full and an empty queue without another bit of >> metadata, which is a cost you're trying to avoid. >> >> This is somewhat problematic if the caller is trying to be "optimal" by >> specifying a power of two in the `size` parameter of the `odp_queue_param_t` >> passed to `odp_queue_create()`. For this reason we may wish to return a >> `max_size` of a power of 2 - 1 in `odp_queue_capability()` as part of this >> patch series. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> This only works if `size` is a power of 2. Should be documented as such, >>> since this is an internal routine. In this case an `ODP_ASSERT(size == >>> ROUNDUP_POWER2_U32(size))` for this requirement would be a useful debugging >>> aid. Bill Fischofer(Bill-Fischofer-Linaro) wrote: should be `num = abs(tail - head);` to deal with wrap arounds, otherwise may be misinterpreted as overly large since it's `uint32_t`. 
Note that GCC and clang recognize `abs()` and treat it as a builtin, so there's no actual `stdlib.h` call here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169819392 updated_at 2018-02-22 09:32:40
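The compile-time guard asked for in this message might look like the sketch below. All `EXAMPLE_*` names and values are assumptions for illustration, C11 `_Static_assert` stands in for `ODP_STATIC_ASSERT()`, and the anonymous struct inside the union is also an assumption since the quoted hunk is cut off before the closing lines.

```c
#include <stdint.h>

/* Assumed values for illustration only; the real pool count is
 * ODP_CONFIG_POOLS and may differ. */
#define EXAMPLE_CONFIG_POOLS 64
#define EXAMPLE_POOL_BITS    8

/* Same 8/24 bit split as in the quoted hunk */
typedef union {
	uint32_t u32;
	struct {
		uint32_t pool   : EXAMPLE_POOL_BITS;
		uint32_t buffer : 32 - EXAMPLE_POOL_BITS;
	};
} example_buffer_index_t;

/* The build fails if a pool index could overflow the pool bit field */
_Static_assert(EXAMPLE_CONFIG_POOLS < (1 << EXAMPLE_POOL_BITS),
	       "pool index does not fit into the pool bit field");
```

Deriving the bound from the field width keeps the check valid if the split is changed later.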
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/odp_pool.c line 28 @@ -296,7 +282,9 @@ static void init_buffers(pool_t *pool) memset(buf_hdr, 0, (uintptr_t)data - (uintptr_t)buf_hdr); /* Initialize buffer metadata */ - buf_hdr->index = i; + buf_hdr->index.u32= 0; + buf_hdr->index.pool = pool->pool_idx; + buf_hdr->index.buffer = i; Comment: Since you're initializing `index.pool` and `index.buffer` there's no need to set `index.u32` here. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > We originally had this index partitioning based on `ODP_CONFIG_POOLS`. Do we > want to return to that here? > > If not, we at least need an `ODP_STATIC_ASSERT()` to ensure that > `ODP_CONFIG_POOLS < 256` or else bad things will happen here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> This routine can be optimized to: >> ``` >> return ring->head == ring->tail; >> ``` >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Your invariant is the queue is empty when `head == tail` therefore the >>> queue is full when `abs(tail - head) == mask`, so the correct calculation >>> here is: >>> >>> `num = mask - abs(tail - head);` >>> >>> The effect is that a queue can only hold `size - 1` elements, otherwise you >>> cannot distinguish between a full and an empty queue without another bit of >>> metadata, which is a cost you're trying to avoid. >>> >>> This is somewhat problematic if the caller is trying to be "optimal" by >>> specifying a power of two in the `size` parameter of the >>> `odp_queue_param_t` passed to `odp_queue_create()`. For this reason we may >>> wish to return a `max_size` of a power of 2 - 1 in `odp_queue_capability()` >>> as part of this patch series. Bill Fischofer(Bill-Fischofer-Linaro) wrote: This only works if `size` is a power of 2. Should be documented as such, since this is an internal routine. In this case an `ODP_ASSERT(size == ROUNDUP_POWER2_U32(size))` for this requirement would be a useful debugging aid. 
> Bill Fischofer(Bill-Fischofer-Linaro) wrote: > should be `num = abs(tail - head);` to deal with wrap arounds, otherwise > may be misinterpreted as overly large since it's `uint32_t`. Note that > GCC and clang recognize `abs()` and treat it as a builtin, so there's no > actual `stdlib.h` call here. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169821150 updated_at 2018-02-22 09:32:40
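The reason given elsewhere in this thread for keeping `index.u32 = 0` (the bitfield may later grow from two to three fields) can be illustrated with a hypothetical three-field layout. None of these names exist in the patch; `extra` is an invented field that the init code does not know about.

```c
#include <stdint.h>

/* Hypothetical later variant of the index with a third field */
typedef union {
	uint32_t u32;
	struct {
		uint32_t pool   : 8;
		uint32_t buffer : 16;
		uint32_t extra  : 8; /* added later, not set by the init code */
	};
} example_index_t;

static inline example_index_t example_index_init(uint32_t pool,
						 uint32_t buffer)
{
	example_index_t index;

	index.u32    = 0;      /* clears every bit, including 'extra' */
	index.pool   = pool;
	index.buffer = buffer;

	return index;
}
```

With the current two-field 8/24 split every bit is covered by the two field assignments, which is the point made in the review comment above.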
Re: [lng-odp] [PATCH v2] Ring implementation of queues
Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: platform/linux-generic/include/odp_ring_st_internal.h line 78 @@ -0,0 +1,118 @@ +/* Copyright (c) 2018, Linaro Limited + * All rights reserved. + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +#ifndef ODP_RING_ST_INTERNAL_H_ +#define ODP_RING_ST_INTERNAL_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include + +/* Basic ring for single thread usage. Operations must be synchronized by using + * locks (or other means), when multiple threads use the same ring. */ +typedef struct { + uint32_t head; + uint32_t tail; + uint32_t mask; + uint32_t *data; + +} ring_st_t; + +/* Initialize ring. Ring size must be a power of two. */ +static inline void ring_st_init(ring_st_t *ring, uint32_t *data, uint32_t size) +{ + ring->head = 0; + ring->tail = 0; + ring->mask = size - 1; + ring->data = data; +} + +/* Dequeue data from the ring head. Max_num is smaller than ring size.*/ +static inline uint32_t ring_st_deq_multi(ring_st_t *ring, uint32_t data[], +uint32_t max_num) +{ + uint32_t head, tail, mask, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + num = tail - head; + + /* Empty */ + if (num == 0) + return 0; + + if (num > max_num) + num = max_num; + + idx = head & mask; + + for (i = 0; i < num; i++) { + data[i] = ring->data[idx]; + idx = (idx + 1) & mask; + } + + ring->head = head + num; + + return num; +} + +/* Enqueue data into the ring tail. Num_data is smaller than ring size. 
*/ +static inline uint32_t ring_st_enq_multi(ring_st_t *ring, const uint32_t data[], +uint32_t num_data) +{ + uint32_t head, tail, mask, size, idx; + uint32_t num, i; + + head = ring->head; + tail = ring->tail; + mask = ring->mask; + size = mask + 1; + num = size - (tail - head); Comment: Your invariant is the queue is empty when `head == tail` therefore the queue is full when `abs(tail - head) == mask`, so the correct calculation here is: `num = mask - abs(tail - head);` The effect is that a queue can only hold `size - 1` elements, otherwise you cannot distinguish between a full and an empty queue without another bit of metadata, which is a cost you're trying to avoid. This is somewhat problematic if the caller is trying to be "optimal" by specifying a power of two in the `size` parameter of the `odp_queue_param_t` passed to `odp_queue_create()`. For this reason we may wish to return a `max_size` of a power of 2 - 1 in `odp_queue_capability()` as part of this patch series. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > This only works if `size` is a power of 2. Should be documented as such, > since this is an internal routine. In this case an `ODP_ASSERT(size == > ROUNDUP_POWER2_U32(size))` for this requirement would be a useful debugging > aid. >> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >> should be `num = abs(tail - head);` to deal with wrap arounds, otherwise may >> be misinterpreted as overly large since it's `uint32_t`. Note that GCC and >> clang recognize `abs()` and treat it as a builtin, so there's no actual >> `stdlib.h` call here. >>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>> Extra blank line should be removed (nit). https://github.com/Linaro/odp/pull/492#discussion_r169794712 updated_at 2018-02-22 09:32:40
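The two free-space calculations under discussion depend on how the indices are stored. A rough sketch of both, with hypothetical function names: the first follows the quoted hunk, where `head` and `tail` run freely and are masked only when indexing; the second assumes indices stored already masked, which is the case where one slot has to stay unused. The second uses a masked difference rather than `abs()`, which is a substitution made here for illustration.

```c
#include <stdint.h>

/* Free-running indices: tail - head ranges from 0 (empty) to size (full),
 * so both states are distinguishable as long as size <= 2^31. */
static inline uint32_t free_slots_free_running(uint32_t head, uint32_t tail,
					       uint32_t mask)
{
	uint32_t size = mask + 1;

	return size - (uint32_t)(tail - head);
}

/* Indices stored masked (always 0..mask): head == tail has to mean empty,
 * so the ring holds at most size - 1 entries. */
static inline uint32_t free_slots_masked(uint32_t head, uint32_t tail,
					 uint32_t mask)
{
	return mask - ((tail - head) & mask);
}
```

For a ring of 8 entries (`mask == 7`), the first variant reports 8 free slots when empty and the second reports 7.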