Hello commit authors (and maintainers),

I'm currently working with rte_flow_async_create() using the postpone flag,
along with rte_flow_push/pull() for batching, in a scenario involving
thousands of flows on a BlueField-2 system.

My goal is to implement hardware steering so that both ingress and egress
traffic bypass the ARM cores of the BF2.

According to the DPDK documentation, rte_flow_push() and rte_flow_pull()
seem to be intended for batching: a large for loop enqueues many flow
operations, rte_flow_push() then commits them to hardware in one go, and
rte_flow_pull() collects the results.

However, I’ve observed that when multiple cores simultaneously insert flow
rules, using rte_flow_push/pull() in such a batched way can result in the
rule insertion operations not being properly transmitted to the hardware.
Specifically, the internal function mlx5dr_send_all_dep_wqe() ends up
getting stuck in its while loop.

Interestingly, if I call rte_flow_push/pull() after *each* individual
rte_flow_async_create() operation, even though that usage seems contrary to
the intended batching model, the infinite loop issue is significantly
mitigated. The frequency of getting stuck in mlx5dr_send_all_dep_wqe()
drops drastically, though it still occurs occasionally.
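
To make the two call patterns concrete, below is a minimal, hardware-free
model of them in C. It is not my actual code and it does not call DPDK:
sim_async_create(), sim_push() and sim_pull() are hypothetical stand-ins
for rte_flow_async_create() with the postpone flag set, rte_flow_push() and
rte_flow_pull(), and struct sim_queue is an invented model of one per-core
flow queue. The model ignores queue size limits and completion timing, so
it only shows the order of the calls, not the hang itself.

```c
#include <stdint.h>

/* Simulated per-core flow queue. NOT DPDK code, illustration only. */
struct sim_queue {
    uint32_t staged;    /* enqueued with postpone set, not yet pushed */
    uint32_t in_flight; /* pushed, completion not yet pulled */
    uint32_t doorbells; /* number of pushes that had work to send */
};

/* Stand-in for rte_flow_async_create() with the postpone flag set. */
static void sim_async_create(struct sim_queue *q)
{
    q->staged++;
}

/* Stand-in for rte_flow_push(): hand all staged operations to hardware. */
static void sim_push(struct sim_queue *q)
{
    if (q->staged == 0)
        return;
    q->in_flight += q->staged;
    q->staged = 0;
    q->doorbells++;
}

/* Stand-in for rte_flow_pull(): collect at most n_res completions. */
static uint32_t sim_pull(struct sim_queue *q, uint32_t n_res)
{
    uint32_t n = q->in_flight < n_res ? q->in_flight : n_res;

    q->in_flight -= n;
    return n;
}

/* Pattern A: enqueue everything, push once, pull until all are done. */
static uint32_t insert_batched(struct sim_queue *q, uint32_t n_rules)
{
    uint32_t done = 0;

    for (uint32_t i = 0; i < n_rules; i++)
        sim_async_create(q);
    sim_push(q);
    while (done < n_rules)
        done += sim_pull(q, 32);
    return done;
}

/* Pattern B: push and pull after every single create. */
static uint32_t insert_per_rule(struct sim_queue *q, uint32_t n_rules)
{
    uint32_t done = 0;

    for (uint32_t i = 0; i < n_rules; i++) {
        sim_async_create(q);
        sim_push(q);
        done += sim_pull(q, 1);
    }
    return done;
}
```

For 1000 rules, this model counts one doorbell for insert_batched() and
1000 for insert_per_rule(). Pattern A is the one that hangs for me under
multi-core insertion; pattern B is the workaround described above.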

In summary, calling rte_flow_push/pull() after each rte_flow_async_create()
seems to avoid the infinite loop, but I’m unsure if this is an expected
usage pattern. I would like to ask:

   - Is this behavior intentional?
   - Am I misunderstanding the design or usage expectations for
     rte_flow_push/pull() in multi-core scenarios?

Thank you for your time and support.
Sincerely,
Seongjong Bae
M.S. Student, T-Networking Lab.

Email: sjbae1...@gmail.com
Mobile: (+82)01089640524
Web: https://tnet.snu.ac.kr/
