This patch updates the programmer's guide to demonstrate the usage
and limitations of the cryptodev symmetric crypto data-path APIs.

Signed-off-by: Fan Zhang <roy.fan.zh...@intel.com>
---
 doc/guides/prog_guide/cryptodev_lib.rst | 266 ++++++++++++++++++++++++
 doc/guides/rel_notes/release_20_08.rst  |   8 +
 2 files changed, 274 insertions(+)
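
Note (not part of the commit): the per-element flag handling used by the
enqueue sample in the guide below can also be written as a stateless helper.
This is only a minimal sketch under stated assumptions: the SKETCH_* flag
values and the sketch_elt_flags() helper are hypothetical stand-ins invented
for illustration. The real RTE_CRYPTO_HW_ENQ_FLAG_* macros come from
rte_cryptodev.h and their values may differ.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in values (assumption): the real
 * RTE_CRYPTO_HW_ENQ_FLAG_* macros are defined in rte_cryptodev.h.
 */
#define SKETCH_ENQ_FLAG_START  (1ULL << 0)
#define SKETCH_ENQ_FLAG_LAST   (1ULL << 1)
#define SKETCH_ENQ_FLAG_IS_SGL (1ULL << 2)

/*
 * Compute the enqueue flags for element i of a frame holding n_elts
 * elements. START is set only for the first element, LAST only for the
 * final one, and IS_SGL only when the element carries a scatter-gather
 * list. Nothing is carried over from the previous iteration.
 */
static inline uint64_t
sketch_elt_flags(uint32_t i, uint32_t n_elts, int is_sgl)
{
	uint64_t flags = 0;

	if (i == 0)
		flags |= SKETCH_ENQ_FLAG_START;
	if (i + 1 == n_elts)
		flags |= SKETCH_ENQ_FLAG_LAST;
	if (is_sgl)
		flags |= SKETCH_ENQ_FLAG_IS_SGL;

	return flags;
}
```

Computing the flags from scratch for every element avoids having to clear
bits left over from the previous element, at the cost of one extra branch
per element. A single-element frame gets both START and LAST.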

diff --git a/doc/guides/prog_guide/cryptodev_lib.rst b/doc/guides/prog_guide/cryptodev_lib.rst
index c14f750fa..9900a593a 100644
--- a/doc/guides/prog_guide/cryptodev_lib.rst
+++ b/doc/guides/prog_guide/cryptodev_lib.rst
@@ -861,6 +861,272 @@ using one of the crypto PMDs available in DPDK.
                                             num_dequeued_ops);
     } while (total_num_dequeued_ops < num_enqueued_ops);
 
+Cryptodev Direct Symmetric Crypto Data-path APIs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The direct symmetric crypto data-path APIs are a set of APIs provided
+specifically for symmetric HW crypto PMDs to offer fast data-path
+enqueue/dequeue operations. The direct data-path APIs take advantage of
+existing Cryptodev APIs for device, queue pair, and session management. In
+addition, the user is required to get the queue pair pointer data and function
+pointers. The APIs are provided as an advanced feature and as an alternative
+to ``rte_cryptodev_enqueue_burst`` and ``rte_cryptodev_dequeue_burst``. They
+are designed to let the user develop a close-to-native performance symmetric
+crypto data-path implementation for applications that do not necessarily
+depend on cryptodev operations, cryptodev operation mempools, or mbufs.
+
+Cryptodev PMDs that support this feature present the
+``RTE_CRYPTODEV_FF_SYM_HW_DIRECT_API`` feature flag. The user calls
+``rte_cryptodev_sym_get_hw_ops`` to get all the function pointers
+for the different enqueue and dequeue operations, plus the device-specific
+queue pair data. After the ``rte_crypto_hw_ops`` structure is properly set by
+the driver, the user can use the function pointers and the queue data pointers
+in the structure to enqueue and dequeue crypto jobs.
+
+To simplify the enqueue APIs, a symmetric job structure is defined:
+
+.. code-block:: c
+
+       /**
+        * Asynchronous operation job descriptor.
+        * Used by HW crypto devices direct API call that supports such activity
+        **/
+       struct rte_crypto_sym_job {
+               union {
+                       /**
+                        * When RTE_CRYPTO_HW_ENQ_FLAG_IS_SGL bit is set in flags, sgl
+                        * field is used as input data. Otherwise data_iova is
+                        * used.
+                        **/
+                       rte_iova_t data_iova;
+                       struct rte_crypto_sgl *sgl;
+               };
+               union {
+                       /**
+                        * Unlike cryptodev ops, all ofs and len fields are in
+                        * units of bytes (including Snow3G/Kasumi/ZUC).
+                        **/
+                       struct {
+                               uint32_t cipher_ofs;
+                               uint32_t cipher_len;
+                       } cipher_only;
+                       struct {
+                               uint32_t auth_ofs;
+                               uint32_t auth_len;
+                               rte_iova_t digest_iova;
+                       } auth_only;
+                       struct {
+                               uint32_t aead_ofs;
+                               uint32_t aead_len;
+                               rte_iova_t tag_iova;
+                               uint8_t *aad;
+                               rte_iova_t aad_iova;
+                       } aead;
+                       struct {
+                               uint32_t cipher_ofs;
+                               uint32_t cipher_len;
+                               uint32_t auth_ofs;
+                               uint32_t auth_len;
+                               rte_iova_t digest_iova;
+                       } chain;
+               };
+               uint8_t *iv;
+               rte_iova_t iv_iova;
+       };
+
+Unlike a Cryptodev operation, the ``rte_crypto_sym_job`` structure
+contains only the data fields required for the crypto PMD to execute a single
+job, and is not meant to be stored as opaque data. The user can freely allocate
+the structure buffer on the stack and reuse it to fill all jobs.
+
+To use the direct symmetric crypto APIs safely, the user has to carefully
+set the correct fields in the ``rte_crypto_sym_job`` structure, otherwise the
+application or the system may crash. There are also a few limitations to the
+direct symmetric crypto APIs:
+
+* Only in-place operations are supported.
+* The APIs are NOT thread-safe.
+* The direct API's enqueue CANNOT be mixed with
+  ``rte_cryptodev_enqueue_burst``, or vice versa.
+
+The following sample code shows how to use the Cryptodev direct API to process
+a user-defined frame of up to 32 buffers with the AES-CBC and HMAC-SHA chained
+algorithm.
+
+See *DPDK API Reference* for details on each API definition.
+
+.. code-block:: c
+
+       #include <rte_cryptodev.h>
+
+       #define FRAME_ELT_OK 0
+       #define FRAME_ELT_FAIL 1
+       #define FRAME_OK 0
+       #define FRAME_SOME_ELT_ERROR 1
+       #define FRAME_SIZE 32
+
+       /* Sample frame element struct */
+       struct sample_frame_elt {
+               /* The status field of frame element */
+               uint8_t status;
+               /* Pre-created and initialized cryptodev session */
+               struct rte_cryptodev_sym_session *session;
+               union {
+                       rte_iova_t data;
+                       struct rte_crypto_sgl sgl;
+               };
+               uint32_t data_len;
+               rte_iova_t digest;
+               uint8_t *iv;
+               uint8_t is_sgl;
+       };
+
+       /* Sample frame struct to describe up to 32 crypto jobs */
+       struct sample_frame {
+               struct sample_frame_elt elts[FRAME_SIZE]; /**< All frame elements */
+               uint32_t n_elts; /**< Number of elements */
+               uint8_t status; /**< Overall frame status, written at dequeue */
+       };
+
+       /* Global Cryptodev Direct API structure */
+       static struct rte_crypto_hw_ops hw_ops;
+
+       /* Initialization */
+       static int
+       frame_operation_init(
+               uint8_t cryptodev_id, /**< Initialized cryptodev ID */
+               uint16_t qp_id /**< Initialized queue pair ID */)
+       {
+               int ret;
+
+               /* Get APIs */
+               ret = rte_cryptodev_sym_get_hw_ops(cryptodev_id, qp_id, &hw_ops);
+               /* If the device does not support this feature or queue pair is not
+                  initialized, return -1 */
+               if (ret < 0)
+                       return -1;
+               return 0;
+       }
+
+       /* Frame enqueue function use direct AES-CBC-* + HMAC-SHA* API */
+       static int
+       enqueue_frame_to_direct_api(
+               struct sample_frame *frame /**< Initialized user frame struct */)
+       {
+               struct rte_crypto_sym_job job;
+               uint64_t drv_data, flags = 0;
+               uint32_t i;
+
+               /* Fill all sample frame element data into HW queue pair */
+               for (i = 0; i < frame->n_elts; i++) {
+                       struct sample_frame_elt *fe = &frame->elts[i];
+                       int ret;
+
+                       /* If it is the first element in the frame, set the START flag to
+                          let the driver know it is the first element and fill drv_data. */
+                       if (i == 0)
+                               flags |= RTE_CRYPTO_HW_ENQ_FLAG_START;
+                       else
+                               flags &= ~RTE_CRYPTO_HW_ENQ_FLAG_START;
+
+                       /* If it is the last element in the frame, set the LAST flag to
+                          kick the HW queue */
+                       if (i == frame->n_elts - 1)
+                               flags |= RTE_CRYPTO_HW_ENQ_FLAG_LAST;
+                       else
+                               flags &= ~RTE_CRYPTO_HW_ENQ_FLAG_LAST;
+
+                       /* Fill the job data with frame element data */
+                       if (fe->is_sgl != 0) {
+                               /* The buffer is a SGL buffer */
+                               job.sgl = &fe->sgl;
+                               /* Set SGL flag */
+                               flags |= RTE_CRYPTO_HW_ENQ_FLAG_IS_SGL;
+                       } else {
+                               job.data_iova = fe->data;
+                               /* Unset SGL flag in the job */
+                               flags &= ~RTE_CRYPTO_HW_ENQ_FLAG_IS_SGL;
+                       }
+
+                       job.chain.cipher_ofs = job.chain.auth_ofs = 0;
+                       job.chain.cipher_len = job.chain.auth_len = fe->data_len;
+                       job.chain.digest_iova = fe->digest;
+
+                       job.iv = fe->iv;
+
+                       /* Call direct data-path enqueue chaining op API */
+                       ret = hw_ops.enqueue_chain(hw_ops.qp, fe->session, &job,
+                               (void *)frame, &drv_data, flags);
+                       /**
+                        * If one element fails to be enqueued, simply abandon
+                        * enqueuing the whole frame.
+                        **/
+                       if (ret < 0)
+                               return -1;
+
+                       /**
+                        * At this point the frame element is enqueued. The job
+                        * buffer can be safely reused for the next frame element.
+                        **/
+               }
+
+               return 0;
+       }
+
+       /**
+        * Sample function to write frame element status field based on
+        * driver returned operation result. The function return and parameter
+        * should follow the prototype rte_crpyto_hw_user_post_deq_cb_fn() in
+        * rte_cryptodev.h
+        **/
+       static __rte_always_inline void
+       write_frame_elt_status(void *data, uint32_t index, uint8_t is_op_success)
+       {
+               struct sample_frame *frame = data;
+               frame->elts[index + 1].status = is_op_success ? FRAME_ELT_OK :
+                       FRAME_ELT_FAIL;
+       }
+
+       /* Frame dequeue function use direct dequeue API */
+       static struct sample_frame *
+       dequeue_frame_with_direct_api(void)
+       {
+               struct sample_frame *ret_frame;
+               uint64_t flags, drv_data;
+               uint32_t n, n_fail, n_fail_first = 0;
+               int ret;
+
+               /* Dequeue first job, which should have frame data stored in opaque */
+               flags = RTE_CRYPTO_HW_DEQ_FLAG_START;
+               ret_frame = hw_ops.dequeue_one(hw_ops.qp, &drv_data, flags, &ret);
+               if (ret == 0) {
+                       /* ret == 0, means it is still under processing */
+                       return NULL;
+               } else if (ret == 1) {
+                       /* ret_frame is successfully retrieved, the ret stores the
+                          operation result */
+                       ret_frame->elts[0].status = FRAME_ELT_OK;
+               } else {
+                       ret_frame->elts[0].status = FRAME_ELT_FAIL;
+                       n_fail_first = 1;
+               }
+
+               /* Query if n_elts has been processed, if not return NULL */
+               if (!hw_ops.query_processed(hw_ops.qp, ret_frame->n_elts))
+                       return NULL;
+
+               /* We are sure all elements have been processed, dequeue them all */
+               flags = 0;
+               ret = hw_ops.dequeue_many(hw_ops.qp, &drv_data, (void *)ret_frame,
+                       write_frame_elt_status, ret_frame->n_elts - 1, flags, &n_fail);
+
+               if (n_fail + n_fail_first > 0)
+                       ret_frame->status = FRAME_SOME_ELT_ERROR;
+               else
+                       ret_frame->status = FRAME_OK;
+
+               return ret_frame;
+       }
+
 Asymmetric Cryptography
 -----------------------
 
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index 39064afbe..eb973693d 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -56,6 +56,14 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+   * **Added Cryptodev data-path APIs for non-mbuf-centric data-paths.**
+
+     Added a set of data-path APIs to Cryptodev that are not based on
+     cryptodev operations. The APIs are designed for external applications
+     or libraries that want to use cryptodev but whose data-path
+     implementations are not mbuf-centric. The QAT Symmetric PMD is also
+     updated to support these APIs.
+
 
 Removed Items
 -------------
-- 
2.20.1
