When a primary handles a pdump enable/disable request from a secondary
(e.g. dpdk-dumpcap), it applies the callbacks locally and forwards the
request to the other secondaries before replying. The forward used
rte_mp_request_sync(), blocking up to MP_TIMEOUT_S for every secondary
to acknowledge.
A secondary that is attached but has not called rte_pdump_init() never
registers the mp_pdump action and silently drops the message (logging
only "Cannot find action: mp_pdump"). Commit c3ceb8742295 ("pdump:
forward callback enable to secondary process"), which added this
broadcast, appears to assume every secondary calls rte_pdump_init();
when one does not, the primary cannot tell it from a slow or dead peer
and blocks for the full timeout. That delay pushes the reply past the
requester's own timeout, so dpdk-dumpcap reports "Packet dump enable
... failed Connection timed out".
Forward the request asynchronously with rte_mp_request_async() so an
unresponsive secondary can no longer block the reply; a callback logs a
warning if not all secondaries acknowledge. The forward is still issued
before the requester reply, preserving the ordering from commit
928f43e3f9c1 ("pdump: fix race in disabling").
This slightly weakens the happens-before guarantee: the sync call was a
full-acknowledgement barrier, whereas the async path relies on AF_UNIX
SOCK_DGRAM FIFO ordering and the single mp_handle thread processing the
forward before the reply that triggers teardown. This suffices for the
disable race fixed by 928f43e3f9c1, but reviewers should confirm they
are comfortable with the weaker guarantee.
Fixes: c3ceb8742295 ("pdump: forward callback enable to secondary process")
Cc: [email protected]
Signed-off-by: Pushpendra Kumar <[email protected]>
---
.mailmap | 1 +
lib/pdump/rte_pdump.c | 23 ++++++++++++++---------
2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/.mailmap b/.mailmap
index c5bc728fae..c1313c7148 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1332,6 +1332,7 @@ Przemyslaw Gierszynski <[email protected]>
Przemyslaw Patynowski <[email protected]>
Przemyslaw Zegan <[email protected]>
Pu Xu <[email protected]>
+Pushpendra Kumar <[email protected]>
Qi Fu <[email protected]>
Qi Zhang <[email protected]>
Qian Hao <[email protected]>
diff --git a/lib/pdump/rte_pdump.c b/lib/pdump/rte_pdump.c
index ac94efe7ff..f2d5b70111 100644
--- a/lib/pdump/rte_pdump.c
+++ b/lib/pdump/rte_pdump.c
@@ -477,11 +477,21 @@ set_pdump_rxtx_cbs(const struct pdump_request *p)
return ret;
}
+/* Async reply handler; warns if any secondary did not respond. */
+static int
+pdump_secondary_reply(const struct rte_mp_msg *request __rte_unused,
+ const struct rte_mp_reply *reply)
+{
+ if (reply->nb_sent != reply->nb_received)
+ PDUMP_LOG_LINE(ERR, "not all secondary's replied (sent %u recv
%u)",
+ reply->nb_sent, reply->nb_received);
+ return 0;
+}
+
static void
pdump_request_to_secondary(const struct pdump_request *req)
{
struct rte_mp_msg mp_req = { };
- struct rte_mp_reply mp_reply;
struct timespec ts = {.tv_sec = MP_TIMEOUT_S, .tv_nsec = 0};
PDUMP_LOG_LINE(DEBUG, "forward req %s to secondary",
pdump_opname(req->op));
@@ -490,14 +500,9 @@ pdump_request_to_secondary(const struct pdump_request *req)
strlcpy(mp_req.name, PDUMP_MP, sizeof(mp_req.name));
mp_req.len_param = sizeof(*req);
- if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) != 0)
- PDUMP_LOG_LINE(ERR, "rte_mp_request_sync failed");
-
- else if (mp_reply.nb_sent != mp_reply.nb_received)
- PDUMP_LOG_LINE(ERR, "not all secondary's replied (sent %u recv
%u)",
- mp_reply.nb_sent, mp_reply.nb_received);
-
- free(mp_reply.msgs);
+ /* Forward asynchronously so an unresponsive secondary cannot block the
requester reply. */
+ if (rte_mp_request_async(&mp_req, &ts, pdump_secondary_reply) != 0)
+ PDUMP_LOG_LINE(ERR, "rte_mp_request_async failed");
}
/* Allocate temporary storage for passing state to the alarm thread for
deferred handling */
--
2.43.0