On Tue, Jun 09, 2026 at 02:02:02PM -0700, Stephen Hemminger wrote:
> Add rte_telemetry_register_cmd_fd_arg() to register a command whose
> callback also receives file descriptors passed by the client as
> SCM_RIGHTS ancillary data. The callback owns the descriptors and must
> close them.
> 
> This lets a client open a file itself and hand the descriptor to the
> primary process, so DPDK never opens the path. That avoids path and
> permission problems and works across container filesystem namespaces.
> 
> Existing commands and clients are unaffected. If unsolicited file
> descriptor is passed, it is closed.
> 

This scheme seems reasonable in general. My only concern is whether the
lack of potential windows support is an issue? For regular telemetry, there
was always the option of a windows implementation using regular
TCP/UDP/SCTP sockets bound to localhost. However, AFAIK there is no windows
implementation of anything that supports file descriptors or handles
between processes.

Some other pieces of feedback inline below.

/Bruce

> Signed-off-by: Stephen Hemminger <[email protected]>
> ---
>  doc/guides/rel_notes/release_26_07.rst |   5 ++
>  lib/telemetry/rte_telemetry.h          |  66 ++++++++++++++
>  lib/telemetry/telemetry.c              | 115 ++++++++++++++++++++++---
>  3 files changed, 174 insertions(+), 12 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_26_07.rst 
> b/doc/guides/rel_notes/release_26_07.rst
> index b5285af5fe..d7a2df88c1 100644
> --- a/doc/guides/rel_notes/release_26_07.rst
> +++ b/doc/guides/rel_notes/release_26_07.rst
> @@ -141,6 +141,11 @@ New Features
>    Added AGENTS.md file for AI review
>    and supporting scripts to review patches and documentation.
>  
> +* **Added telemetry support for passing file descriptors.**
> +
> +  Add experimental telemetry callback ``rte_telemetry_register_cmd_fd_arg()``
> +  to allow command to receive file descriptors passed by client.
> +
>  
>  Removed Items
>  -------------
> diff --git a/lib/telemetry/rte_telemetry.h b/lib/telemetry/rte_telemetry.h
> index 0a58e518f7..3e32d2902b 100644
> --- a/lib/telemetry/rte_telemetry.h
> +++ b/lib/telemetry/rte_telemetry.h
> @@ -325,6 +325,37 @@ typedef int (*telemetry_cb)(const char *cmd, const char 
> *params,
>  typedef int (*telemetry_arg_cb)(const char *cmd, const char *params, void 
> *arg,
>               struct rte_tel_data *info);
>  
> +/**
> + * This telemetry callback is used when registering a telemetry command with
> + * rte_telemetry_register_cmd_fd_arg().
> + *
> + * It behaves like telemetry_arg_cb, but additionally receives any file
> + * descriptors the client passed alongside the command as SCM_RIGHTS 
> ancillary
> + * data. The callback takes ownership of these descriptors and is responsible
> + * for closing them.
> + *
> + * @param cmd
> + *   The cmd that was requested by the client.
> + * @param params
> + *   Contains data required by the callback function.
> + * @param arg
> + *   The opaque value that was passed to rte_telemetry_register_cmd_fd_arg().
> + * @param fds
> + *   Array of file descriptors received from the client. May be NULL when
> + *   n_fds is zero.
> + * @param n_fds
> + *   Number of file descriptors in the fds array.
> + * @param info
> + *   The information to be returned to the caller.
> + *
> + * @return
> + *   Length of buffer used on success.
> + * @return
> + *   Negative integer on error.
> + */
> +typedef int (*telemetry_fd_cb)(const char *cmd, const char *params, void 
> *arg,
> +             const int *fds, unsigned int n_fds, struct rte_tel_data *info);
> +

Do we anticipate in future having callbacks taking more than one FD? Would
it not be simpler just to a single fd parameter (which is -1 on no fd
passed).

>  /**
>   * Used when registering a command and callback function with telemetry.
>   *
> @@ -368,6 +399,41 @@ __rte_experimental
>  int
>  rte_telemetry_register_cmd_arg(const char *cmd, telemetry_arg_cb fn, void 
> *arg, const char *help);
>  
> +/**
> + * Register a command and a file-descriptor-aware callback with telemetry.
> + *
> + * The callback is invoked like rte_telemetry_register_cmd_arg(), but also
> + * receives any file descriptors the client passed alongside the command as
> + * SCM_RIGHTS ancillary data. This lets a client open a file (for example a
> + * capture output file) itself and hand the descriptor to the DPDK process,
> + * which never opens the path - avoiding path and permission concerns and
> + * working across container filesystem namespaces.
> + *
> + * Descriptors sent to a command registered with rte_telemetry_register_cmd()
> + * or rte_telemetry_register_cmd_arg() are rejected and the connection is
> + * closed.
> + *
> + * @param cmd
> + *   The command to register with telemetry.
> + * @param fn
> + *   Callback function to be called when the command is requested.
> + * @param arg
> + *   An opaque value that will be passed to the callback function.
> + * @param help
> + *   Help text for the command.
> + *
> + * @return
> + *   0 on success.
> + * @return
> + *   -EINVAL for invalid parameters failure.
> + * @return
> + *   -ENOMEM for mem allocation failure.
> + */
> +__rte_experimental
> +int
> +rte_telemetry_register_cmd_fd_arg(const char *cmd, telemetry_fd_cb fn, void 
> *arg,
> +             const char *help);
> +

Do we want to make this experimental for later stabilization, or is this an
API that is best kept as internal-only? I'd tend towards keeping it
internal-only, rather than allowing apps to define callbacks which take
FDs.

>  /**
>   * @internal
>   * Free a container that has memory allocated.
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index b109d076d4..30d3ae3a13 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -29,6 +29,8 @@
>  #define MAX_CMD_LEN 56
>  #define MAX_OUTPUT_LEN (1024 * 16)
>  #define MAX_CONNECTIONS 10
> +/* Maximum number of file descriptors a client may pass with one command. */
> +#define MAX_FDS 8
>  

As above, do we really need multiple FDs?

>  #ifndef RTE_EXEC_ENV_WINDOWS
>  static void *
> @@ -39,6 +41,7 @@ struct cmd_callback {
>       char cmd[MAX_CMD_LEN];
>       telemetry_cb fn;
>       telemetry_arg_cb fn_arg;
> +     telemetry_fd_cb fn_fd;
>       void *arg;
>       char help[RTE_TEL_MAX_STRING_LEN];
>  };
> @@ -72,15 +75,15 @@ static RTE_ATOMIC(uint16_t) v2_clients;
>  #endif /* !RTE_EXEC_ENV_WINDOWS */
>  
>  static int
> -register_cmd(const char *cmd, const char *help,
> -          telemetry_cb fn, telemetry_arg_cb fn_arg, void *arg)
> +register_cmd(const char *cmd, const char *help, telemetry_cb fn,
> +          telemetry_arg_cb fn_arg, telemetry_fd_cb fn_fd, void *arg)
>  {
>       struct cmd_callback *new_callbacks;
>       const char *cmdp = cmd;
>       int i = 0;
>  
> -     if (strlen(cmd) >= MAX_CMD_LEN || (fn == NULL && fn_arg == NULL) || 
> cmd[0] != '/'
> -                     || strlen(help) >= RTE_TEL_MAX_STRING_LEN)
> +     if (strlen(cmd) >= MAX_CMD_LEN || (fn == NULL && fn_arg == NULL && 
> fn_fd == NULL)
> +                     || cmd[0] != '/' || strlen(help) >= 
> RTE_TEL_MAX_STRING_LEN)
>               return -EINVAL;
>  
>       while (*cmdp != '\0') {
> @@ -107,6 +110,7 @@ register_cmd(const char *cmd, const char *help,
>       strlcpy(callbacks[i].cmd, cmd, MAX_CMD_LEN);
>       callbacks[i].fn = fn;
>       callbacks[i].fn_arg = fn_arg;
> +     callbacks[i].fn_fd = fn_fd;
>       callbacks[i].arg = arg;
>       strlcpy(callbacks[i].help, help, RTE_TEL_MAX_STRING_LEN);
>       num_callbacks++;
> @@ -119,14 +123,22 @@ RTE_EXPORT_SYMBOL(rte_telemetry_register_cmd)
>  int
>  rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char 
> *help)
>  {
> -     return register_cmd(cmd, help, fn, NULL, NULL);
> +     return register_cmd(cmd, help, fn, NULL, NULL, NULL);
>  }
>  
>  RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_telemetry_register_cmd_arg, 24.11)
>  int
>  rte_telemetry_register_cmd_arg(const char *cmd, telemetry_arg_cb fn, void 
> *arg, const char *help)
>  {
> -     return register_cmd(cmd, help, NULL, fn, arg);
> +     return register_cmd(cmd, help, NULL, fn, NULL, arg);
> +}
> +
> +RTE_EXPORT_EXPERIMENTAL_SYMBOL(rte_telemetry_register_cmd_fd_arg, 26.07)
> +int
> +rte_telemetry_register_cmd_fd_arg(const char *cmd, telemetry_fd_cb fn, void 
> *arg,
> +             const char *help)
> +{
> +     return register_cmd(cmd, help, NULL, NULL, fn, arg);
>  }
>  
>  #ifndef RTE_EXEC_ENV_WINDOWS
> @@ -368,13 +380,70 @@ output_json(const char *cmd, const struct rte_tel_data 
> *d, int s)
>               TMTY_LOG_LINE(ERR, "Error writing to socket: %s", 
> strerror(errno));
>  }
>  
> +/*
> + * Receive a command and any file descriptors the client passed alongside it
> + * as SCM_RIGHTS ancillary data. The payload length is returned (0 if the
> + * client sent an empty message or closed the connection, negative on error).
> + * Descriptors that arrive are returned in fds[]/n_fds and are owned by the
> + * caller. MSG_CTRUNC means more descriptors were sent than the control 
> buffer
> + * could hold; *ctrunc is set so the caller can reject the command, but the
> + * descriptors that did fit are still returned so they can be closed rather
> + * than leaked.
> + */
> +static int
> +recv_with_fds(int s, char *buf, size_t buf_len, int *fds, unsigned int 
> *n_fds,
> +           bool *ctrunc)
> +{
> +     char cmsgbuf[CMSG_SPACE(sizeof(int) * MAX_FDS)];
> +     struct iovec iov = { .iov_base = buf, .iov_len = buf_len };
> +     struct msghdr msg = {
> +             .msg_iov = &iov,
> +             .msg_iovlen = 1,
> +             .msg_control = cmsgbuf,
> +             .msg_controllen = sizeof(cmsgbuf),
> +     };
> +     struct cmsghdr *cmsg;
> +     int bytes;
> +
> +     *n_fds = 0;
> +     *ctrunc = false;
> +
> +     bytes = recvmsg(s, &msg, 0);
> +     if (bytes < 0)
> +             return bytes;
> +
> +     if (msg.msg_flags & MSG_CTRUNC)
> +             *ctrunc = true;
> +
> +     for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL; cmsg = CMSG_NXTHDR(&msg, 
> cmsg)) {
> +             if (cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != 
> SCM_RIGHTS)
> +                     continue;
> +             *n_fds = (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int);
> +             memcpy(fds, CMSG_DATA(cmsg), *n_fds * sizeof(int));
> +             break;
> +     }
> +     return bytes;
> +}
> +
>  static void
> -perform_command(const struct cmd_callback *cb, const char *cmd, const char 
> *param, int s)
> +close_fds(const int *fds, unsigned int n_fds)
> +{
> +     unsigned int i;
> +
> +     for (i = 0; i < n_fds; i++)
> +             close(fds[i]);
> +}
> +
> +static void
> +perform_command(const struct cmd_callback *cb, const char *cmd, const char 
> *param,
> +             const int *fds, unsigned int n_fds, int s)
>  {
>       struct rte_tel_data data = {0};
>       int ret;
>  
> -     if (cb->fn_arg != NULL)
> +     if (cb->fn_fd != NULL)
> +             ret = cb->fn_fd(cmd, param, cb->arg, fds, n_fds, &data);
> +     else if (cb->fn_arg != NULL)
>               ret = cb->fn_arg(cmd, param, cb->arg, &data);
>       else
>               ret = cb->fn(cmd, param, &data);
> @@ -412,8 +481,11 @@ client_handler(void *sock_id)
>       }
>  
>       /* receive data is not null terminated */
> -     int bytes = read(s, buffer, sizeof(buffer) - 1);
> -     while (bytes > 0) {
> +     int fds[MAX_FDS];
> +     unsigned int n_fds = 0;
> +     bool ctrunc = false;
> +     int bytes = recv_with_fds(s, buffer, sizeof(buffer) - 1, fds, &n_fds, 
> &ctrunc);
> +     while (bytes > 0 || (bytes == 0 && n_fds > 0)) {
>               buffer[bytes] = 0;
>               const char *cmd = strtok(buffer, ",");
>               const char *param = strtok(NULL, "\0");
> @@ -429,9 +501,28 @@ client_handler(void *sock_id)
>                               }
>                       rte_spinlock_unlock(&callback_sl);
>               }
> -             perform_command(&cb, cmd, param, s);
>  
> -             bytes = read(s, buffer, sizeof(buffer) - 1);
> +             /*
> +              * File descriptors go only to a command that registered to
> +              * receive them. A command that did not, or a truncated control
> +              * message, is a client error: close the descriptors and drop 
> the
> +              * connection rather than silently discarding them.
> +              */
> +             if (n_fds > 0 && (cb.fn_fd == NULL || ctrunc)) {
> +                     TMTY_LOG_LINE(ERR,
> +                             "Closing connection: %u file descriptor(s) 
> passed to '%s'%s",
> +                             n_fds, cmd ? cmd : "(none)",
> +                             ctrunc ? " (truncated)" : " which does not 
> accept them");
> +                     close_fds(fds, n_fds);
> +                     break;
> +             }
> +
> +             /* an fd-aware callback takes ownership of the descriptors */
> +             perform_command(&cb, cmd, param, fds, n_fds, s);
> +
> +             n_fds = 0;
> +             ctrunc = false;

the receive function always resets this to false anyway, so you may be able
to omit this line (assuming compiler doesn't complain).

> +             bytes = recv_with_fds(s, buffer, sizeof(buffer) - 1, fds, 
> &n_fds, &ctrunc);
>       }
>  exit:
>       close(s);
> -- 
> 2.53.0
> 

Reply via email to