Hi Konstantin,

Please find some comments inline.

> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.anan...@intel.com>
> Sent: Wednesday, September 16, 2020 7:10 PM
> To: Gujjar, Abhinandan S <abhinandan.guj...@intel.com>; dev@dpdk.org
> Cc: Doherty, Declan <declan.dohe...@intel.com>; jer...@marvell.com;
> akhil.go...@nxp.com; Vangati, Narender <narender.vang...@intel.com>;
> Gujjar, Abhinandan S <abhinandan.guj...@intel.com>
> Subject: RE: [dpdk-dev] [v2 1/2] cryptodev: support enqueue callback
> functions
>
> Hi Guijjar,
>
> >
> > In an eventdev world, multiple workers (with ordered queue) will be
> > working on IPsec ESP processing. The ESP header's sequence number is
> > unique and has to be sequentially incremented in an orderly manner.
> > This rises a need for incrementing sequence number in crypto stage
> > especially in event crypto adapter. By adding a user callback to
> > cryptodev at enqueue burst, the user callback will get executed in the
> > context of event crypto adapter. This helps the application to
> > increment the ESP sequence number atomically and orderly manner.
> >
> > This patch adds APIs to add/remove callback functions. The callback
> > function will be called for each burst of crypto ops received on a
> > given crypto device queue pair.
> >
> > v1->v2:
> > Moved callback related members to the end of cryptodev struct
> > Added support for RCU
> >
> > Signed-off-by: Abhinandan Gujjar <abhinandan.guj...@intel.com>
> > ---
> >  config/common_base                             |   1 +
> >  lib/librte_cryptodev/Makefile                  |   2 +-
> >  lib/librte_cryptodev/rte_cryptodev.c           | 157 +++++++++++++++++++++++++
> >  lib/librte_cryptodev/rte_cryptodev.h           | 154 +++++++++++++++++++++++-
> >  lib/librte_cryptodev/rte_cryptodev_version.map |   6 +
> >  5 files changed, 318 insertions(+), 2 deletions(-)
> >
> > diff --git a/config/common_base b/config/common_base
> > index fbf0ee7..f5ebde4 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -599,6 +599,7 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC=y
> >  #
> >  CONFIG_RTE_LIBRTE_CRYPTODEV=y
> >  CONFIG_RTE_CRYPTO_MAX_DEVS=64
> > +CONFIG_RTE_CRYPTODEV_CALLBACKS=y
> >
> >  #
> >  # Compile PMD for ARMv8 Crypto device
> > diff --git a/lib/librte_cryptodev/Makefile b/lib/librte_cryptodev/Makefile
> > index 73e77a2..514d552 100644
> > --- a/lib/librte_cryptodev/Makefile
> > +++ b/lib/librte_cryptodev/Makefile
> > @@ -10,7 +10,7 @@ LIB = librte_cryptodev.a
> >  CFLAGS += -O3
> >  CFLAGS += $(WERROR_FLAGS)
> >  LDLIBS += -lrte_eal -lrte_mempool -lrte_ring -lrte_mbuf
> > -LDLIBS += -lrte_kvargs
> > +LDLIBS += -lrte_kvargs -lrte_rcu
> >
> >  # library source files
> >  SRCS-y += rte_cryptodev.c rte_cryptodev_pmd.c cryptodev_trace_points.c
> > diff --git a/lib/librte_cryptodev/rte_cryptodev.c
> > b/lib/librte_cryptodev/rte_cryptodev.c
> > index 1dd795b..2fb3e35 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev.c
> > +++ b/lib/librte_cryptodev/rte_cryptodev.c
> > @@ -38,6 +38,7 @@
> >  #include <rte_string_fns.h>
> >  #include <rte_compat.h>
> >  #include <rte_function_versioning.h>
> > +#include <rte_rcu_qsbr.h>
> >
> >  #include "rte_crypto.h"
> >  #include "rte_cryptodev.h"
> > @@ -499,6 +500,10 @@ struct rte_cryptodev_sym_session_pool_private_data {
> >  	return 0;
> >  }
> >
> > +#ifdef RTE_CRYPTODEV_CALLBACKS
> > +/* spinlock for crypto device enq callbacks */
> > +static rte_spinlock_t rte_cryptodev_enq_cb_lock = RTE_SPINLOCK_INITIALIZER;
> > +#endif
> >
> >  const char *
> >  rte_cryptodev_get_feature_name(uint64_t flag)
> > @@ -1449,6 +1454,158 @@ struct rte_cryptodev *
> >  	rte_spinlock_unlock(&rte_cryptodev_cb_lock);
> >  }
> >
> > +#ifdef RTE_CRYPTODEV_CALLBACKS
> > +int
> > +rte_cryptodev_rcu_qsbr_add(uint8_t dev_id, struct rte_rcu_qsbr *qsbr)
> > +{
> > +
> > +	struct rte_cryptodev *dev;
> > +
> > +	if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
> > +		CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	dev = &rte_crypto_devices[dev_id];
> > +	dev->qsbr = qsbr;
> > +	return 0;
> > +}
>
> So if I understand your patch correctly you propose a new working model for
> crypto-devs:
> 1. Control-plane has to allocate/setup rcu_qsbr and do
>    rte_cryptodev_rcu_qsbr_add().
> 2. Data-plane has somehow to obtain pointer to that rcu_qsbr and wrap
>    cryptodev_enqueue()
>    with rcu_qsbr_quiescent() or rcu_qsbr_online()/rcu_qsbr_offline().

Yes, but I don't think this is a new model. It is the same as the RCU
integration with LPM. Please refer to:
https://patches.dpdk.org/cover/73673/

>
> That seems quite a big change and I don't think it is acceptable for most
> users.
> From my perspective adding/installing call-backs to the dev has to be opaque
> to the data-plane code.
> Also note that different callbacks can be installed by different entities
> (libs)
> and might have no idea about each other.
> That's why I thought it would be better to make all this RCU stuff internal
> inside cryptodev:
>     hide all this rcu_qsbr allocation/setup inside cryptodev init/queue setup
>     invoke rcu_qsbr_online()/rcu_qsbr_offline() inside cryptodev_enqueue().

I have already explored the approach described above. There are too many
constraints, and the changes do not fit the RCU design. Moreover, calling
the RCU API inside enqueue_burst() will affect performance.
The changes are mostly on the control-plane side, which is a one-time cost.
The data-plane changes are minimal.

>
> > +
> > +struct rte_cryptodev_enq_callback *
> > +rte_cryptodev_add_enq_callback(uint8_t dev_id,
> > +		uint16_t qp_id,
> > +		rte_cryptodev_enq_cb_fn cb_fn,
> > +		void *cb_arg)
> > +{
> > +	struct rte_cryptodev *dev;
> > +	struct rte_cryptodev_enq_callback *cb, *tail;
> > +
> > +	if (!cb_fn)
> > +		return NULL;
> > +
> > +	if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
> > +		CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
> > +		return NULL;
> > +	}
> > +
> > +	dev = &rte_crypto_devices[dev_id];
> > +	if (qp_id >= dev->data->nb_queue_pairs) {
> > +		CDEV_LOG_ERR("Invalid queue_pair_id=%d", qp_id);
> > +		return NULL;
> > +	}
> > +
> > +	cb = rte_zmalloc(NULL, sizeof(*cb), 0);
> > +	if (cb == NULL) {
> > +		CDEV_LOG_ERR("Failed to allocate memory for callback on "
> > +			     "dev=%d, queue_pair_id=%d", dev_id, qp_id);
> > +		rte_errno = ENOMEM;
> > +		return NULL;
> > +	}
> > +
> > +	cb->fn = cb_fn;
> > +	cb->arg = cb_arg;
> > +
> > +	rte_spinlock_lock(&rte_cryptodev_enq_cb_lock);
> > +	if (dev->enq_cbs == NULL) {
> > +		dev->enq_cbs = rte_zmalloc(NULL, sizeof(cb) *
> > +				dev->data->nb_queue_pairs, 0);
> > +		if (dev->enq_cbs == NULL) {
> > +			CDEV_LOG_ERR("Failed to allocate memory for callbacks");
> > +			rte_errno = ENOMEM;
> > +			rte_free(cb);
> > +			return NULL;
> > +		}
> > +	}
> > +
> > +	/* Add the callbacks in fifo order. */
> > +	tail = dev->enq_cbs[qp_id];
> > +	if (tail) {
> > +		while (tail->next)
> > +			tail = tail->next;
> > +		tail->next = cb;
> > +	} else
> > +		dev->enq_cbs[qp_id] = cb;
> > +
> > +	rte_spinlock_unlock(&rte_cryptodev_enq_cb_lock);
> > +
> > +	return cb;
> > +}
> > +
> > +int
> > +rte_cryptodev_remove_enq_callback(uint8_t dev_id,
> > +		uint16_t qp_id,
> > +		struct rte_cryptodev_enq_callback *cb)
> > +{
> > +	struct rte_cryptodev *dev;
> > +	struct rte_cryptodev_enq_callback **prev_cb, *curr_cb;
> > +	uint16_t qp;
> > +	int free_mem;
> > +	int ret;
> > +
> > +	free_mem = 1;
> > +
> > +	if (!cb) {
> > +		CDEV_LOG_ERR("cb is NULL");
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
> > +		CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	dev = &rte_crypto_devices[dev_id];
> > +	if (qp_id >= dev->data->nb_queue_pairs) {
> > +		CDEV_LOG_ERR("Invalid queue_pair_id=%d", qp_id);
> > +		return -EINVAL;
> > +	}
> > +
> > +	if (!dev->qsbr) {
> > +		CDEV_LOG_ERR("Rcu qsbr is NULL");
> > +		return -EINVAL;
> > +	}
> > +
> > +	rte_spinlock_lock(&rte_cryptodev_enq_cb_lock);
> > +	if (dev->enq_cbs == NULL) {
> > +		rte_spinlock_unlock(&rte_cryptodev_enq_cb_lock);
> > +		return -EINVAL;
> > +	}
> > +
> > +	prev_cb = &dev->enq_cbs[qp_id];
> > +	for (; *prev_cb != NULL; prev_cb = &curr_cb->next) {
> > +		curr_cb = *prev_cb;
> > +		if (curr_cb == cb) {
> > +			/* Remove the user cb from the callback list. */
> > +			*prev_cb = curr_cb->next;
> > +			ret = 0;
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (!ret) {
> > +		/* Call sync with invalid thread id as this is part of
> > +		 * control plane API */
> > +		rte_rcu_qsbr_synchronize(dev->qsbr,
> > +				RTE_QSBR_THRID_INVALID);
> > +		rte_free(cb);
> > +	}
> > +
> > +	for (qp = 0; qp < dev->data->nb_queue_pairs; qp++)
> > +		if (dev->enq_cbs[qp] != NULL) {
>
> Some reference count (number of callbacks) seems like a better approach
> here.

OK.

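To make the reference-count idea concrete, here is a rough standalone sketch of what I understand it to mean. It is not DPDK code: struct enq_cb, struct cb_table, cb_add(), cb_remove() and the nb_cbs counter are hypothetical stand-ins for illustration, plain calloc()/free() replace rte_zmalloc()/rte_free(), and locking and the RCU synchronize step are left out. The point is only that a per-device count of installed callbacks can replace the scan over every queue pair when deciding whether to free the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rte_cryptodev_enq_callback. */
struct enq_cb {
	struct enq_cb *next;
	void *arg;
};

/* Hypothetical per-device bookkeeping: list heads plus a callback count. */
struct cb_table {
	struct enq_cb **enq_cbs; /* one list head per queue pair, or NULL */
	uint16_t nb_qps;
	uint32_t nb_cbs;         /* callbacks installed across all queue pairs */
};

/* Append a callback in FIFO order; returns NULL on allocation failure. */
static struct enq_cb *
cb_add(struct cb_table *t, uint16_t qp_id, void *arg)
{
	struct enq_cb *cb, **tail;

	assert(t != NULL && qp_id < t->nb_qps);

	cb = calloc(1, sizeof(*cb));
	if (cb == NULL)
		return NULL;
	cb->arg = arg;

	if (t->enq_cbs == NULL) {
		t->enq_cbs = calloc(t->nb_qps, sizeof(*t->enq_cbs));
		if (t->enq_cbs == NULL) {
			free(cb);
			return NULL;
		}
	}

	/* Walk to the end of the list for this queue pair. */
	for (tail = &t->enq_cbs[qp_id]; *tail != NULL; tail = &(*tail)->next)
		;
	*tail = cb;
	t->nb_cbs++;
	return cb;
}

/* Unlink and free a callback; returns 0 on success, -1 if not found. */
static int
cb_remove(struct cb_table *t, uint16_t qp_id, struct enq_cb *cb)
{
	struct enq_cb **prev;

	assert(t != NULL && qp_id < t->nb_qps);

	if (t->enq_cbs == NULL || cb == NULL)
		return -1;

	for (prev = &t->enq_cbs[qp_id]; *prev != NULL; prev = &(*prev)->next) {
		if (*prev != cb)
			continue;
		*prev = cb->next;
		/* The real code would wait for readers (RCU) before freeing. */
		free(cb);
		/* The count replaces the loop over every queue pair. */
		if (--t->nb_cbs == 0) {
			free(t->enq_cbs);
			t->enq_cbs = NULL;
		}
		return 0;
	}
	return -1;
}
```

With a count like this, the array is released as soon as the last callback on any queue pair goes away, without touching the other list heads.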
>
> > +			free_mem = 0;
> > +			break;
> > +		}
> > +
> > +	if (free_mem) {
> > +		rte_free(dev->enq_cbs);
> > +		dev->enq_cbs = NULL;
> > +	}
> > +
> > +	rte_spinlock_unlock(&rte_cryptodev_enq_cb_lock);
> > +
> > +	return ret;
> > +}
> > +#endif
> >
> >  int
> >  rte_cryptodev_sym_session_init(uint8_t dev_id,
> > diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > b/lib/librte_cryptodev/rte_cryptodev.h
> > index 7b3ebc2..2c7a47b 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev.h
> > +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > @@ -530,6 +530,32 @@ struct rte_cryptodev_qp_conf {
> >  };
> >
> >  /**
> > + * Function type used for pre processing crypto ops when enqueue burst is
> > + * called.
> > + *
> > + * The callback function is called on enqueue burst immediately
> > + * before the crypto ops are put onto the hardware queue for processing.
> > + *
> > + * @param	dev_id		The identifier of the device.
> > + * @param	qp_id		The index of the queue pair in which ops are
> > + *				to be enqueued for processing. The value
> > + *				must be in the range [0, nb_queue_pairs - 1]
> > + *				previously supplied to
> > + *				*rte_cryptodev_configure*.
> > + * @param	ops		The address of an array of *nb_ops* pointers
> > + *				to *rte_crypto_op* structures which contain
> > + *				the crypto operations to be processed.
> > + * @param	nb_ops		The number of operations to process.
> > + * @param	user_param	The arbitrary user parameter passed in by the
> > + *				application when the callback was originally
> > + *				registered.
> > + * @return			The number of ops to be enqueued to the
> > + *				crypto device.
> > + */
> > +typedef uint16_t (*rte_cryptodev_enq_cb_fn)(uint16_t dev_id, uint16_t qp_id,
> > +		struct rte_crypto_op **ops, uint16_t nb_ops, void *user_param);
> > +
> > +/**
> >   * Typedef for application callback function to be registered by application
> >   * software for notification of device events
> >   *
> > @@ -853,7 +879,6 @@ struct rte_cryptodev_config {
> >  		enum rte_cryptodev_event_type event,
> >  		rte_cryptodev_cb_fn cb_fn, void *cb_arg);
> >
> > -
> >  typedef uint16_t (*dequeue_pkt_burst_t)(void *qp,
> >  		struct rte_crypto_op **ops, uint16_t nb_ops);
> >  /**< Dequeue processed packets from queue pair of a device. */
> > @@ -870,6 +895,17 @@ typedef uint16_t (*enqueue_pkt_burst_t)(void *qp,
> >  /** Structure to keep track of registered callbacks */
> >  TAILQ_HEAD(rte_cryptodev_cb_list, rte_cryptodev_callback);
> >
> > +/**
> > + * @internal
> > + * Structure used to hold information about the callbacks to be called for a
> > + * queue pair on enqueue.
> > + */
> > +struct rte_cryptodev_enq_callback {
> > +	struct rte_cryptodev_enq_callback *next;
> > +	rte_cryptodev_enq_cb_fn fn;
> > +	void *arg;
> > +};
> > +
> >  /** The data structure associated with each crypto device. */
> >  struct rte_cryptodev {
> >  	dequeue_pkt_burst_t dequeue_burst;
> > @@ -898,6 +934,14 @@ struct rte_cryptodev {
> >  	__extension__
> >  	uint8_t attached : 1;
> >  	/**< Flag indicating the device is attached */
> > +
> > +#ifdef RTE_CRYPTODEV_CALLBACKS
> > +	struct rte_cryptodev_enq_callback **enq_cbs;
> > +	/**< User application callback for pre enqueue processing */
> > +
> > +	struct rte_rcu_qsbr *qsbr;
> > +	/** < RCU QSBR variable for rte_cryptodev_enq_callback */
>
> Probably better to have both these fields per queue.
> Space for them can be allocated at dev_configure() or so.

enq_cbs is allocated when a callback is added. Unlike ethdev, each cryptodev
has its own maximum number of queue pairs, and there is no macro for that.
I think a single RCU variable should be good enough, as it has a mechanism
to track all of its reporting threads.

> BTW, wouldn't it make sense to have ability to add callback for dequeue too?

As mentioned in the commit message, this patch was driven by a requirement.
If required, a callback for dequeue can be added in a separate patch.

>
> > +#endif
> >  } __rte_cache_aligned;
> >
> >  void *
> > @@ -1019,6 +1063,18 @@ struct rte_cryptodev_data {
> >  		struct rte_crypto_op **ops, uint16_t nb_ops)
> >  {
> >  	struct rte_cryptodev *dev = &rte_cryptodevs[dev_id];
> > +#ifdef RTE_CRYPTODEV_CALLBACKS
> > +	if (unlikely(dev->enq_cbs != NULL && dev->enq_cbs[qp_id] != NULL)) {
> > +		struct rte_cryptodev_enq_callback *cb =
> > +			dev->enq_cbs[qp_id];
> > +
> > +		do {
> > +			nb_ops = cb->fn(dev_id, qp_id, ops, nb_ops,
> > +					cb->arg);
> > +			cb = cb->next;
> > +		} while (cb != NULL);
> > +	}
> > +#endif
> >
> >  	rte_cryptodev_trace_enqueue_burst(dev_id, qp_id, (void **)ops, nb_ops);
> >  	return (*dev->enqueue_burst)(
> > @@ -1351,6 +1407,102 @@ struct rte_cryptodev_asym_session *
> >  	struct rte_cryptodev_sym_session *sess, union rte_crypto_sym_ofs ofs,
> >  	struct rte_crypto_sym_vec *vec);
> >
> > +#ifdef RTE_CRYPTODEV_CALLBACKS
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Add a user callback for a given crypto device and queue pair which will be
> > + * called on crypto ops enqueue.
> > + *
> > + * This API configures a function to be called for each burst of crypto ops
> > + * received on a given crypto device queue pair. The return value is a pointer
> > + * that can be used later to remove the callback using
> > + * rte_cryptodev_remove_enq_callback().
> > + *
> > + * Multiple functions are called in the order that they are added.
> > + *
> > + * @param	dev_id		The identifier of the device.
> > + * @param	qp_id		The index of the queue pair in which ops are
> > + *				to be enqueued for processing. The value
> > + *				must be in the range [0, nb_queue_pairs - 1]
> > + *				previously supplied to
> > + *				*rte_cryptodev_configure*.
> > + * @param	cb_fn		The callback function
> > + * @param	cb_arg		A generic pointer parameter which will be passed
> > + *				to each invocation of the callback function on
> > + *				this crypto device and queue pair.
> > + *
> > + * @return
> > + *   NULL on error.
> > + *   On success, a pointer value which can later be used to remove the callback.
> > + */
> > +
> > +__rte_experimental
> > +struct rte_cryptodev_enq_callback *
> > +rte_cryptodev_add_enq_callback(uint8_t dev_id,
> > +		uint16_t qp_id,
> > +		rte_cryptodev_enq_cb_fn cb_fn,
> > +		void *cb_arg);
> > +
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Remove a user callback function for given crypto device and queue pair.
> > + *
> > + * This function is used to removed callbacks that were added to a crypto
> > + * device queue pair using rte_cryptodev_add_enq_callback().
> > + *
> > + * Note: The callback expects a RCU QSBR to be configured to synchronize
> > + * to free the memory. Application is expected to configure RCU QSBR after
> > + * adding an enqueue callback.
> > + *
> > + *
> > + * @param	dev_id		The identifier of the device.
> > + * @param	qp_id		The index of the queue pair in which ops are
> > + *				to be enqueued for processing. The value
> > + *				must be in the range [0, nb_queue_pairs - 1]
> > + *				previously supplied to
> > + *				*rte_cryptodev_configure*.
> > + * @param	cb		Pointer to user supplied callback created via
> > + *				rte_cryptodev_add_enq_callback().
> > + *
> > + * @return
> > + *   - 0: Success. Callback was removed.
> > + *   - -EINVAL:  The dev_id or the qp_id is out of range, or the callback
> > + *               is NULL or not found for the crypto device queue pair.
> > + */
> > +
> > +__rte_experimental
> > +int rte_cryptodev_remove_enq_callback(uint8_t dev_id,
> > +		uint16_t qp_id,
> > +		struct rte_cryptodev_enq_callback *cb);
> > +
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + *
> > + * Associate RCU QSBR variable with a cryptodev.
> > + *
> > + * This function is used to add RCU QSBR to a crypto device.
> > + * The purpose of RCU is to help multiple threads to synchronize
> > + * with each other before initiating adding/removing callback
> > + * while dataplane threads are running enqueue callbacks.
> > + *
> > + * @param	dev_id		The identifier of the device.
> > + * @param	qsr		RCU QSBR configuration
> > + * @return
> > + *   On success - 0
> > + *   On error - EINVAL.
> > + */
> > +
> > +__rte_experimental
> > +int rte_cryptodev_rcu_qsbr_add(uint8_t dev_id, struct rte_rcu_qsbr *qsbr);
> > +#endif
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/librte_cryptodev/rte_cryptodev_version.map
> > b/lib/librte_cryptodev/rte_cryptodev_version.map
> > index 02f6dcf..46de3ca 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev_version.map
> > +++ b/lib/librte_cryptodev/rte_cryptodev_version.map
> > @@ -64,6 +64,7 @@ DPDK_20.0 {
> >  	rte_cryptodev_sym_capability_get;
> >  };
> >
> > +
> >  EXPERIMENTAL {
> >  	global:
> >
> > @@ -105,4 +106,9 @@ EXPERIMENTAL {
> >
> >  	# added in 20.08
> >  	rte_cryptodev_get_qp_status;
> > +
> > +	# added in 20.11
> > +	rte_cryptodev_add_enq_callback;
> > +	rte_cryptodev_remove_enq_callback;
> > +	rte_cryptodev_rcu_qsbr_add;
> >  };
> > --
> > 1.9.1
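
For context on the requirement that drove the patch, below is a small standalone sketch of the kind of enqueue callback I have in mind. It is only an illustration under assumptions: struct op is a hypothetical stand-in for struct rte_crypto_op that carries nothing but an ESP sequence number, the callback type mirrors rte_cryptodev_enq_cb_fn from the patch, and assign_esp_seq() and run_enq_cbs() are made-up names. The callback reserves a block of sequence numbers with a single atomic add, and the chain walk mirrors the loop added to the enqueue path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_crypto_op: ESP sequence number only. */
struct op {
	uint32_t esp_seq;
};

/* Same shape as rte_cryptodev_enq_cb_fn in the patch, using struct op. */
typedef uint16_t (*enq_cb_fn)(uint16_t dev_id, uint16_t qp_id,
		struct op **ops, uint16_t nb_ops, void *user_param);

struct enq_cb {
	struct enq_cb *next;
	enq_cb_fn fn;
	void *arg;
};

/* Callback: reserve nb_ops consecutive sequence numbers with one atomic add. */
static uint16_t
assign_esp_seq(uint16_t dev_id, uint16_t qp_id, struct op **ops,
		uint16_t nb_ops, void *user_param)
{
	_Atomic uint32_t *next_seq = user_param;
	uint32_t first;
	uint16_t i;

	(void)dev_id;
	(void)qp_id;
	assert(next_seq != NULL);

	first = atomic_fetch_add(next_seq, nb_ops);
	for (i = 0; i < nb_ops; i++)
		ops[i]->esp_seq = first + i;
	return nb_ops;
}

/* Run the callbacks in the order they were added, threading nb_ops through. */
static uint16_t
run_enq_cbs(struct enq_cb *cb, uint16_t dev_id, uint16_t qp_id,
		struct op **ops, uint16_t nb_ops)
{
	for (; cb != NULL; cb = cb->next)
		nb_ops = cb->fn(dev_id, qp_id, ops, nb_ops, cb->arg);
	return nb_ops;
}
```

Because the callback runs in the context of whoever calls enqueue burst, the ordering of bursts (and therefore of sequence numbers) is whatever that caller provides. In the adapter case that is the ordered flow described in the commit message.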