In lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h,
rte_mb() is defined as asm "sync" and
rte_wmb() is defined as asm "lwsync".

mlx5_tx_dbrec_cond_wmb() uses rte_wmb() to ensure ordering between
the DB record update and the BF copy.
The POWER9 (P9) processor does not have a strongly-ordered memory
model, so this barrier is not strict enough there and rte_mb() has to
be used instead.
On x86, which has a strongly-ordered memory model, using rte_mb()
instead of rte_wmb() causes a performance hit of up to ~10%.

This patch adds mlx5_arch_specific_mb(), defined as rte_mb() for PPC64
and as rte_wmb() for all other architectures.
mlx5_tx_dbrec_cond_wmb() now uses mlx5_arch_specific_mb(), so the
ordering between DB record and BF copy is guaranteed on every
architecture without penalizing x86.

Original work by Yongseok Koh.

Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: sta...@dpdk.org

Signed-off-by: Dekel Peled <dek...@mellanox.com>
---
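Note (illustration only, not part of the patch): the barrier selection
described in the commit message can be sketched outside DPDK as below.
The sketch_* names are hypothetical, and the C11 fences are only rough
stand-ins for rte_mb() and rte_wmb(), not their DPDK definitions. The
doorbell helper is a simplified assumption about the write sequence,
not the driver's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; DPDK's rte_mb()/rte_wmb() are defined per arch. */
#define sketch_full_mb()  atomic_thread_fence(memory_order_seq_cst)
#define sketch_write_mb() atomic_thread_fence(memory_order_release)

/* Pick the stricter barrier only where the weaker one is not sufficient. */
#if defined(__PPC64__)
#define sketch_arch_specific_mb() sketch_full_mb()
#else
#define sketch_arch_specific_mb() sketch_write_mb()
#endif

/*
 * Simplified doorbell sequence (assumption): update the doorbell record,
 * issue the barrier, then write the WQE word to the doorbell register.
 */
static inline void
sketch_ring_doorbell(volatile uint32_t *db_rec, volatile uint64_t *bf_reg,
		     uint32_t wqe_ci, uint64_t wqe)
{
	assert(db_rec != NULL && bf_reg != NULL);
	*db_rec = wqe_ci;
	/* Ensure ordering between DB record and BF copy. */
	sketch_arch_specific_mb();
	*bf_reg = wqe;
}
```

Keeping the choice in one macro leaves the call site identical on all
architectures, so only the PPC64 build pays for the full barrier.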
 drivers/net/mlx5/mlx5_rxtx.h  | 2 +-
 drivers/net/mlx5/mlx5_utils.h | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 53115dd..df51589 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -707,7 +707,7 @@ uint32_t mlx5_tx_update_ext_mp(struct mlx5_txq_data *txq, uintptr_t addr,
        rte_cio_wmb();
        *txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
        /* Ensure ordering between DB record and BF copy. */
-       rte_wmb();
+       mlx5_arch_specific_mb();
        mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
        if (cond)
                rte_wmb();
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index 97092c7..6742271 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -25,6 +25,15 @@
 #define bool _Bool
 #endif
 
+/*
+ * Define strict memory-barrier for PPC64.
+ */
+#if defined(__PPC64__)
+#define mlx5_arch_specific_mb() rte_mb()
+#else
+#define mlx5_arch_specific_mb() rte_wmb()
+#endif
+
 /* Bit-field manipulation. */
 #define BITFIELD_DECLARE(bf, type, size) \
        type bf[(((size_t)(size) / (sizeof(type) * CHAR_BIT)) + \
-- 
1.8.3.1
