Hi Pavan,

I am resending my review here. It seems you didn't get it.

On Oct 14, 2024 at 13:58, Pavan Nikhilesh <pbhagavat...@marvell.com> wrote:


Introduce the ability for nodes to advertise error counters during
registration and increment them during the node process function in
the graph library.
This enhancement allows for better error tracking and debugging
capabilities within the graph framework.

The number of errors and the mapping of error IDs to error descriptions
are defined during node registration.
If an error is encountered during the node process function while walking
the graph, the respective error counter is incremented.

This feature could be useful to store detailed statistics per node, not only for errors. It would be better to rename "errors" to "xstats".

See below for a concrete suggestion.


Example:
        static struct rte_node_errors ip4_reassembly_errors = {
                .nb_errors = 1,
                .err_desc = {
                        [0] = "ip4_reassembly_error",
                },
        };

 static const struct rte_node_xstats ip4_reassembly_xstats = {
   .xstats_num = 1,
   .xstats_desc = {
     [0] = "ip4_reassembly_error",
   },
 };


Here, "ip4_reassembly_error" is mapped to error ID 0, and the same ID is
used in the `ip4_reassembly_node_process` function to increment reassembly
errors.
Depending on the node, there can be multiple such errors that can be
updated independently and retrieved using `rte_graph_cluster_stats_get`.

Example:
+-------------------------------+---------------+---------------+--------------+
|Node                           |calls          |objs           |realloc_count |
+-------------------------------+---------------+---------------+--------------+
|ip4_lookup                     |1324083        |338965248      |2             |
|       ip4_lookup_error        |               |338965496      |              |
|pkt_drop                       |1324084        |338965504      |1             |
|ethdev_rx-0-0                  |1324086        |338966016      |2             |
|pkt_cls                        |1324086        |338966016      |1             |
+-------------------------------+---------------+---------------+--------------+

v2 Changes:
- Fix compilation.
v3 Changes:
- Resend as 1/5 didn't make it through.
v4 Changes:
- Address review comments.
- Rebase on main branch.
v5 Changes:
- Shrink structure member names.(Robin)
- add rte_node_error_increment utility function. (Robin)
- Squash patches. (Robin)
- Update RN, DN. (David)

Pavan Nikhilesh (3):
  graph: add support for node specific errors
  graph: add node error counters

I think patch 1/3 and 2/3 should be split differently. My preference would be to have documentation (especially large svg images) in a separate commit. Other than that, the changes in lib/graph should be squashed in the same patch.

  node: add error stats for ip4 nodes

 doc/guides/prog_guide/graph_lib.rst           |  22 +-
 .../prog_guide/img/anatomy_of_a_node.svg      | 329 +++++--
 .../prog_guide/img/graph_mem_layout.svg       | 921 +++++++++++++-----
 doc/guides/rel_notes/deprecation.rst          |   6 -
 doc/guides/rel_notes/release_24_11.rst        |   8 +
 lib/graph/graph_populate.c                    |  20 +-
 lib/graph/graph_private.h                     |   3 +
 lib/graph/graph_stats.c                       |  78 +-
 lib/graph/node.c                              |  37 +-
 lib/graph/rte_graph.h                         |  11 +
 lib/graph/rte_graph_worker_common.h           |  23 +
 lib/graph/version.map                         |   7 +
 lib/node/ip4_lookup.c                         |   9 +
 lib/node/ip4_lookup_neon.h                    |   5 +
 lib/node/ip4_lookup_sse.h                     |   6 +
 lib/node/ip4_reassembly.c                     |   9 +
 lib/node/node_private.h                       |   8 +
 17 files changed, 1192 insertions(+), 310 deletions(-)

To summarize changes, here is my proposal:

struct rte_node_xstats {
  uint8_t xstats_num; /**< Number of xstats. */
  char (*xstats_desc)[RTE_NODE_XSTATS_DESC_SIZE]; /**< Names of xstats. */
};

struct rte_node_register {
  ...
  const struct rte_node_xstats *xstats; /**< Node specific extra statistics. */
  ...
};

static inline void
rte_node_xstat_increment(struct rte_node *node, uint16_t stat_id, uint64_t value)
{
#ifdef RTE_LIBRTE_GRAPH_STATS
  /* Counters are stored at a byte offset from the node object. */
  uint64_t *xstats = RTE_PTR_ADD(node, node->err_off);
  xstats[stat_id] += value;
#else
  RTE_SET_USED(node);
  RTE_SET_USED(stat_id);
  RTE_SET_USED(value);
#endif
}
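
To illustrate how a node process function would call such a helper, here is
a minimal standalone sketch. It does not use the real graph headers: every
mock_* name, MOCK_XSTATS_MAX and the "zero length means failure" rule are
assumptions made up for this example, and the counters are simply placed in
the same structure as the node and reached through a byte offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_XSTATS_MAX 4 /* hypothetical cap, only for this sketch */

/* Hypothetical stand-in for struct rte_node. */
struct mock_node {
	uint32_t xstat_off;               /* byte offset from node to counters */
	uint64_t xstats[MOCK_XSTATS_MAX]; /* counter storage */
};

/* Stat IDs are plain indexes into the description table. */
enum { MOCK_IP4_REASSEMBLY_ERROR = 0 };

static void
mock_node_init(struct mock_node *node)
{
	memset(node, 0, sizeof(*node));
	node->xstat_off = offsetof(struct mock_node, xstats);
}

/* Same shape as the proposed helper, on the mock node type. */
static inline void
mock_node_xstat_increment(struct mock_node *node, uint16_t stat_id,
			  uint64_t value)
{
	uint64_t *xstats = (uint64_t *)((uintptr_t)node + node->xstat_off);

	xstats[stat_id] += value;
}

/*
 * Mock process function: a zero length stands for a packet that failed
 * reassembly. Failures are counted locally and the counter is updated
 * once per burst. Returns the number of objects that succeeded.
 */
static uint16_t
mock_process(struct mock_node *node, const uint16_t *pkt_len,
	     uint16_t nb_objs)
{
	uint16_t i, failed = 0;

	for (i = 0; i < nb_objs; i++) {
		if (pkt_len[i] == 0)
			failed++;
	}
	if (failed != 0)
		mock_node_xstat_increment(node, MOCK_IP4_REASSEMBLY_ERROR,
					  failed);

	return (uint16_t)(nb_objs - failed);
}
```

Taking a value instead of always adding 1 lets the process function update
the counter once per burst rather than once per packet.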

struct __rte_cache_aligned rte_graph_cluster_node_stats {
  ...
  uint8_t xstats_num;
  char (*xstats_desc)[RTE_NODE_XSTATS_DESC_SIZE];
  uint64_t *xstats_val;
  ...
};
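
For the stats side, here is a rough standalone sketch of how these three
fields could be filled and printed. It assumes that the cluster value of an
xstat is simply the sum of the counters of the same node across all graphs.
All mock_* names and the "name=value" output format are made up for this
example and are not part of the library.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MOCK_XSTATS_DESC_SIZE 64 /* hypothetical, for this sketch only */

/* Only the xstats related fields of the cluster node stats. */
struct mock_cluster_node_stats {
	uint8_t xstats_num;
	const char (*xstats_desc)[MOCK_XSTATS_DESC_SIZE];
	uint64_t *xstats_val;
};

/*
 * Sum the counters of one node over all graphs it belongs to.
 * per_graph[g] points to the counter array of the node in graph g.
 */
static void
mock_cluster_xstats_collect(struct mock_cluster_node_stats *stat,
			    const uint64_t *const *per_graph,
			    unsigned int nb_graphs)
{
	unsigned int g;
	uint8_t i;

	for (i = 0; i < stat->xstats_num; i++) {
		uint64_t sum = 0;

		for (g = 0; g < nb_graphs; g++)
			sum += per_graph[g][i];
		stat->xstats_val[i] = sum;
	}
}

/* Format one xstat as "name=value". Returns the snprintf() result. */
static int
mock_cluster_xstat_format(char *buf, size_t len,
			  const struct mock_cluster_node_stats *stat,
			  uint8_t id)
{
	return snprintf(buf, len, "%s=%" PRIu64, stat->xstats_desc[id],
			stat->xstats_val[id]);
}
```

Keeping the descriptions and the values in two parallel arrays indexed by
stat ID means that the print callback needs no per-node knowledge.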

Let me know what you think. Thanks!
