On Mon, 17 Mar 2025 14:46:07 +0100 Jan Viktorin <[email protected]> wrote:
> Hello Vipin and others, > > please, will there be any progress or update on this series? > > I successfully tested those changes on our Intel and AMD machines and > would like to use it in production soon. > > The API is a little bit unintuitive, at least for me, but I > successfully integrated into our software. > > I am missing a clear relation to the NUMA socket approach used in DPDK. > E.g. I would like to be able to easily walk over a list of lcores from > a specific NUMA node grouped by L3 domain. Yes, there is the > RTE_LCORE_DOMAIN_IO, but would it always match the appropriate socket > IDs? > > Also, I do not clearly understand what is the purpose of using domain > selector like: > > RTE_LCORE_DOMAIN_L1 | RTE_LCORE_DOMAIN_L2 > > or even: > > RTE_LCORE_DOMAIN_L3 | RTE_LCORE_DOMAIN_L2 > > the documentation does not explain this. I could not spot any kind of > grouping that would help me in any way. Some "best practices" examples > would be nice to have to understand the intentions better. > > I found a little catch when running DPDK with more lcores than there > are physical or SMT CPU cores. This happens when using e.g. an option > like --lcores=(0-15)@(0-1). The results from the topology API would not > match the lcores because hwloc is not aware of the lcores concept. This > might be mentioned somewhere. > > Anyway, I really appreciate this work and would like to see it upstream. > Especially for AMD machines, some framework like this is a must. > > Kind regards, > Jan > > On Tue, 5 Nov 2024 15:58:45 +0530 > Vipin Varghese <[email protected]> wrote: > > > This patch introduces improvements for NUMA topology awareness in > > relation to DPDK logical cores. The goal is to expose API which allows > > users to select optimal logical cores for any application. These > > logical cores can be selected from various NUMA domains like CPU and > > I/O. > > > > Change Summary: > > - Introduces the concept of NUMA domain partitioning based on CPU and > > I/O topology. > > - Adds support for grouping DPDK logical cores within the same Cache > > and I/O domain for improved locality. > > - Implements topology detection and core grouping logic that > > distinguishes between the following NUMA configurations: > > * CPU topology & I/O topology (e.g., AMD SoC EPYC, Intel Xeon SPR) > > * CPU+I/O topology (e.g., Ampere One with SLC, Intel Xeon SPR > > with SNC) > > - Enhances performance by minimizing lcore dispersion across > > tiles|compute package with different L2/L3 cache or IO domains. > > > > Reason: > > - Applications using DPDK libraries relies on consistent memory > > access. > > - Lcores being closer to same NUMA domain as IO. > > - Lcores sharing same cache. > > > > Latency is minimized by using lcores that share the same NUMA > > topology. Memory access is optimized by utilizing cores within the > > same NUMA domain or tile. Cache coherence is preserved within the > > same shared cache domain, reducing the remote access from > > tile|compute package via snooping (local hit in either L2 or L3 > > within same NUMA domain). > > > > Library dependency: hwloc > > > > Topology Flags: > > --------------- > > - RTE_LCORE_DOMAIN_L1: to group cores sharing same L1 cache > > - RTE_LCORE_DOMAIN_SMT: same as RTE_LCORE_DOMAIN_L1 > > - RTE_LCORE_DOMAIN_L2: group cores sharing same L2 cache > > - RTE_LCORE_DOMAIN_L3: group cores sharing same L3 cache > > - RTE_LCORE_DOMAIN_L4: group cores sharing same L4 cache > > - RTE_LCORE_DOMAIN_IO: group cores sharing same IO > > > > < Function: Purpose > > > --------------------- > > - rte_get_domain_count: get domain count based on Topology Flag > > - rte_lcore_count_from_domain: get valid lcores count under each > > domain > > - rte_get_lcore_in_domain: valid lcore id based on index > > - rte_lcore_cpuset_in_domain: return valid cpuset based on index > > - rte_lcore_is_main_in_domain: return true|false if main lcore is > > present > > - rte_get_next_lcore_from_domain: next valid lcore within domain > > - rte_get_next_lcore_from_next_domain: next valid lcore from next > > domain > > > > Note: > > 1. Topology is NUMA grouping. > > 2. Domain is various sub-groups within a specific Topology. > > > > Topology example: L1, L2, L3, L4, IO > > Domian example: IO-A, IO-B > > > > < MACRO: Purpose > > > ------------------ > > - RTE_LCORE_FOREACH_DOMAIN: iterate lcores from all domains > > - RTE_LCORE_FOREACH_WORKER_DOMAIN: iterate worker lcores from all > > domains > > - RTE_LCORE_FORN_NEXT_DOMAIN: iterate domain select n'th lcore > > - RTE_LCORE_FORN_WORKER_NEXT_DOMAIN: iterate domain for worker n'th > > lcore. > > > > Future work (after merge): > > -------------------------- > > - dma-perf per IO NUMA > > - eventdev per L3 NUMA > > - pipeline per SMT|L3 NUMA > > - distributor per L3 for Port-Queue > > - l2fwd-power per SMT > > - testpmd option for IO NUMA per port > > > > Platform tested on: > > ------------------- > > - INTEL(R) XEON(R) PLATINUM 8562Y+ (support IO numa 1 & 2) > > - AMD EPYC 8534P (supports IO numa 1 & 2) > > - AMD EPYC 9554 (supports IO numa 1, 2, 4) > > > > Logs: > > ----- > > 1. INTEL(R) XEON(R) PLATINUM 8562Y+: > > - SNC=1 > > Domain (IO): at index (0) there are 48 core, with (0) at > > index 0 > > - SNC=2 > > Domain (IO): at index (0) there are 24 core, with (0) at > > index 0 Domain (IO): at index (1) there are 24 core, with (12) at > > index 0 > > > > 2. AMD EPYC 8534P: > > - NPS=1: > > Domain (IO): at index (0) there are 128 core, with (0) at > > index 0 > > - NPS=2: > > Domain (IO): at index (0) there are 64 core, with (0) at > > index 0 Domain (IO): at index (1) there are 64 core, with (32) at > > index 0 > > > > Signed-off-by: Vipin Varghese <[email protected]> > > > > Vipin Varghese (4): > > eal/lcore: add topology based functions > > test/lcore: enable tests for topology > > doc: add topology grouping details > > examples: update with lcore topology API > > > > app/test/test_lcores.c | 528 +++++++++++++ > > config/meson.build | 18 + > > .../prog_guide/env_abstraction_layer.rst | 22 + > > examples/helloworld/main.c | 154 +++- > > examples/l2fwd/main.c | 56 +- > > examples/skeleton/basicfwd.c | 22 + > > lib/eal/common/eal_common_lcore.c | 714 > > ++++++++++++++++++ lib/eal/common/eal_private.h | > > 58 ++ lib/eal/freebsd/eal.c | 10 + > > lib/eal/include/rte_lcore.h | 209 +++++ > > lib/eal/linux/eal.c | 11 + > > lib/eal/meson.build | 4 + > > lib/eal/version.map | 11 + > > lib/eal/windows/eal.c | 12 + > > 14 files changed, 1819 insertions(+), 10 deletions(-) > > This patch series does not apply cleanly to current DPDK main branch. Please rebase and resubmit. AI patch review had the following insights: # DPDK Patch Series Review: Topology-based Lcore Functions (v4) **Series**: [PATCH v4 1-4/4] eal/lcore: add topology based functions **Date**: November 5, 2024 **Author**: Vipin Varghese <[email protected]> --- ## Executive Summary This 4-patch series adds topology-aware lcore mapping to DPDK's lcore API, allowing cores to be grouped by chiplet locality (L1-L4 cache, IO). The series includes API additions, tests, documentation, and example updates. **Overall Assessment**: The patches have several issues that need to be addressed before acceptance. --- ## Patch 1/4: eal/lcore: add topology based functions ### Commit Message Review #### ✅ PASS: Subject Line Format - Length: 45 characters (within 60 char limit) - Format: `eal/lcore: add topology based functions` - Proper prefix, lowercase, imperative mood, no trailing period #### ⚠️ WARNING: Commit Body Issues **Line Length Violations** (must be ≤75 characters): - Line 147: "Using hwloc library, the dpdk available lcores can be grouped" - appears OK - Line 148: "into various groups nameley L1, L2, L3, L4 and IO. This patch" - OK - However, several lines approach the limit **Typo**: "nameley" should be "namely" (line 147) **Body Structure**: The commit message starts appropriately but could be more detailed about the problem being solved and why this change is needed. #### ❌ ERROR: Missing Required Tags The patch is missing: - `Signed-off-by:` tag - This is **MANDATORY** per AGENTS.md line 116 The version history (v4 changes) is good and appropriately placed but should be separated from the main commit message by a `---` line before the diff. ### Code Review #### License Headers Need to check if new files have proper SPDX headers. From the visible diff, additions to existing files should maintain their existing headers. #### Naming Conventions **✅ PASS**: External API functions properly prefixed with `rte_`: - `rte_get_domain_count` - `rte_get_lcore_in_domain` - `rte_get_next_lcore_from_domain` - `rte_get_next_lcore_from_next_domain` - `rte_lcore_count_from_domain` - `rte_lcore_cpuset_in_domain` - `rte_lcore_is_main_in_domain` **⚠️ INFO**: Internal functions lack `rte_` prefix, which is appropriate for internal APIs: - `get_domain_lcore_count` - `get_domain_lcore_mapping` - `rte_eal_topology_init` - `rte_eal_topology_release` However, the last two (`rte_eal_topology_*`) have `rte_` prefix despite being listed as internal - verify if these should be marked as `__rte_internal`. #### Code Style Issues From the visible diffs: **Line Length**: Need to verify all code lines are ≤100 characters **Type Usage**: - Good: Conversion of `l3_count` & `io_count` to `uint16_t` (mentioned in v4 changes) **Memory Management**: - Good: Removed unnecessary NULL checks before free (mentioned in v4 changes) - Good: Removed unnecessary malloc casting (mentioned in v4 changes) **Comments**: Would need to see full code to verify comment formatting #### API Design Issues **⚠️ WARNING: Experimental APIs** All new external APIs should be marked with `__rte_experimental` per AGENTS.md line 766. The commit message mentions "External Experimental API" but need to verify in the actual header file that each function has the `__rte_experimental` attribute on its own line. **⚠️ WARNING: Missing Doxygen** New public APIs must have Doxygen comments (AGENTS.md line 756). Need to verify full header file. #### Testing Requirements **⚠️ WARNING**: Per AGENTS.md line 698-699: - New APIs must be used in `/app` test directory - New device APIs require at least one driver implementation The series includes patch 2/4 for tests, which is good. Need to verify the tests use the `TEST_ASSERT` macros and `unit_test_suite_runner` infrastructure (AGENTS.md lines 703-743). ### Documentation Requirements **⚠️ WARNING**: Per AGENTS.md line 758-762: - Release notes must be updated in `doc/guides/rel_notes/` for important changes - Code and documentation must be updated atomically The series includes patch 3/4 for documentation, which is good. However, need to verify: - Documentation matches code behavior - Only the **current release** notes file is updated - Doxygen comments are present for all public APIs --- ## Patch 2/4: test/lcore: enable tests for topology ### Commit Message Review #### Subject Line - Format: `test/lcore: enable tests for topology` - Length: 37 characters ✅ - Proper prefix, lowercase, imperative ✅ #### Missing Elements - ❌ **ERROR**: Missing `Signed-off-by:` tag ### Test Code Requirements **Need to verify**: - Tests use `TEST_ASSERT` macros (AGENTS.md line 745-752) - Tests use `unit_test_suite_runner` infrastructure (AGENTS.md line 703-743) - Tests are properly registered with `REGISTER_FAST_TEST` or similar --- ## Patch 3/4: doc: add topology grouping details ### Commit Message Review #### Subject Line - Format: `doc: add topology grouping details` - Length: 35 characters ✅ - Proper prefix, lowercase, imperative ✅ #### Missing Elements - ❌ **ERROR**: Missing `Signed-off-by:` tag ### Documentation Requirements **Need to verify**: - Documentation matches actual code behavior - Release notes updated for current release only - Proper RST formatting - No passive voice (per DPDK documentation standards) --- ## Patch 4/4: examples: update with lcore topology API ### Commit Message Review #### Subject Line - Format: `examples: update with lcore topology API` - Length: 39 characters ✅ - Proper prefix ✅ **⚠️ WARNING: Component Prefix** Per AGENTS.md line 89, `example:` should be `examples/foo:` with specific example name. Should be something like: `examples/helloworld: add topology support` #### Missing Elements - ❌ **ERROR**: Missing `Signed-off-by:` tag ### Code Review for Examples From the visible diff in helloworld: **Line 2608-2609**: Ternary operator formatting ```c rte_eal_remote_launch((topo_sel == USE_NO_TOPOLOGY) ? lcore_hello : send_lcore_hello, NULL, lcore_id); ``` This is acceptable but could be more readable. Consider: ```c lcore_func = (topo_sel == USE_NO_TOPOLOGY) ? lcore_hello : send_lcore_hello; rte_eal_remote_launch(lcore_func, NULL, lcore_id); ``` **Line 2631-2632**: Comment formatting ```c +/* select lcores based on ports numa (RTE_LCORE_DOMAIN_IO). */ +static bool select_port_from_io_domain; ``` ✅ Good comment style **L2fwd example changes** (lines 2624-2729): **Line 2641**: Missing space in comment ```c + " -t : Enable IO domain lcores mapping to Ports\n" ``` Should probably be: "IO-domain lcore mapping" or "IO domain lcore mapping" **Line 2670**: Variable type change ```c - unsigned lcore_id, rx_lcore_id; + uint16_t lcore_id, rx_lcore_id; ``` ✅ Good: Using explicit types instead of `unsigned` **Line 2681-2686**: Array initialization ```c + uint16_t coreindx_io_domain[RTE_MAX_ETHPORTS] = {0}; + uint16_t lcore_io_domain[RTE_MAX_ETHPORTS] = {RTE_MAX_LCORE}; ``` **⚠️ WARNING**: The second initialization is problematic. It only sets the first element to `RTE_MAX_LCORE`, not all elements. Should use a loop or memset for initialization: ```c for (int i = 0; i < RTE_MAX_ETHPORTS; i++) lcore_io_domain[i] = RTE_MAX_LCORE; ``` **Line 2683**: Variable naming issue ```c + uint16_t l3_domain_count = rte_get_domain_count(RTE_LCORE_DOMAIN_IO); ``` **❌ ERROR**: Variable is named `l3_domain_count` but gets IO domain count. Should be `io_domain_count`. **Line 2705-2710**: Parentheses in while condition ```c while ((rte_lcore_is_enabled(rx_lcore_id) == 0) || (lcore_queue_conf[rx_lcore_id].n_rx_port == l2fwd_rx_queue_per_lcore)) { ``` Excessive parentheses - the outer ones around each condition aren't needed. **Skeleton example changes** (lines 2731-2777): **Line 2750**: Typo in comment ```c + /* get the socekt of each port */ ``` Should be "socket", not "socekt" **Line 2760**: Function usage looks good ```c + core_count_from_io = rte_lcore_count_from_domain(RTE_LCORE_DOMAIN_IO, sel_io_indx); ``` --- ## Summary of Critical Issues ### Errors (Must Fix): 1. **All 4 patches missing `Signed-off-by:` tags** - This is mandatory per DPDK contribution guidelines 2. **Patch 4**: Variable naming bug - `l3_domain_count` should be `io_domain_count` 3. **Patch 4**: Array initialization bug - `lcore_io_domain` array not fully initialized ### Warnings (Should Fix): 4. **Patch 1**: Typo in commit message - "nameley" → "namely" 5. **Patch 4**: Wrong component prefix - should be `examples/helloworld:` and `examples/l2fwd:` and `examples/skeleton:` 6. **Patch 4**: Typo in comment - "socekt" → "socket" 7. **Patch 4**: Unnecessary parentheses in while condition 8. All patches should verify: - API functions have `__rte_experimental` markers - Doxygen comments present for all public APIs - Tests use proper `TEST_ASSERT` infrastructure - Release notes updated ### Info (Consider): 9. **Patch 1**: Version history (v4 changes) should be after `---` separator, not in commit body 10. **Patch 4**: Ternary operator could be more readable with intermediate variable 11. Verify all code lines are ≤100 characters 12. Verify commit body lines are ≤75 characters --- ## Recommended Actions 1. **Immediate**: Add `Signed-off-by:` tags to all patches with author's real name and email 2. **Immediate**: Fix variable naming bug (`l3_domain_count` → `io_domain_count`) 3. **Immediate**: Fix array initialization bug for `lcore_io_domain` 4. Fix typos: "nameley" → "namely", "socekt" → "socket" 5. Update patch 4 subject lines to use specific example names 6. Verify all new APIs are marked `__rte_experimental` 7. Verify all new APIs have Doxygen documentation 8. Move v4 changelog to post-`---` section 9. Run DPDK validation tools: - `devtools/check-git-log.sh` - `devtools/checkpatches.sh` - `devtools/test-meson-builds.sh` --- ## Validation Commands Before resubmitting, run: ```bash # Check commit messages devtools/check-git-log.sh # Check patch formatting devtools/checkpatches.sh *.patch # Verify compilation devtools/test-meson-builds.sh # Check maintainers devtools/get-maintainer.sh *.patch ``` --- ## Positive Aspects ✅ Good responsiveness to v4 feedback (removed malloc casting, NULL checks, fixed types) ✅ Comprehensive series with tests, documentation, and examples ✅ Clear API naming with proper `rte_` prefixes ✅ Good use of explicit types (`uint16_t`) instead of `unsigned` ✅ Appropriate subject line formatting and component prefixes (mostly) The core concept and implementation approach appear sound, but the patches need the above corrections before merging.

