Re: [PATCH v3] eal/linux: fix fbarray name with multiple secondary processes

Stephen Hemminger Wed, 14 Jan 2026 03:45:00 -0800

On Sat, 16 Nov 2024 12:07:41 +0800
Congjie Zhou <[email protected]> wrote:


> add the tsc into the name.
> 
> Suggested-by: Stephen Hemminger <[email protected]>
> Signed-off-by: Congjie Zhou <[email protected]>
> ---

This patch seems to have gotten stuck in the unreviewed dead pile.
Looking at it, it addresses a potential problem but there are issues.
Please consider what this AI review feedback had to say and resubmit

Now let me analyze this patch against the AGENTS.md criteria and evaluate the 
technical correctness.

## Evaluation Against AGENTS.md

### Commit Message Issues

| Check | Status | Details |
|-------|--------|---------|
| Subject ≤60 chars | ✅ | 56 characters |
| Correct prefix | ✅ | `eal/linux:` is correct |
| Lowercase after colon | ✅ | "fix fbarray..." |
| Imperative mood | ✅ | "fix" |
| No trailing period | ✅ | |
| Signed-off-by present | ✅ | |
| Body lines ≤75 chars | ✅ | |

**Warning**: The commit body is weak - "add the tsc into the name." is very 
terse and doesn't explain the *problem* being solved. The actual explanation is 
after `---` which won't appear in git history. The body should explain the 
container PID namespace issue before the Signed-off-by.

### Code Style

| Check | Status | Details |
|-------|--------|---------|
| Include order | ✅ | `inttypes.h` with system includes, `rte_cycles.h` with 
DPDK includes |
| PRIx64 format | ✅ | Correct portable format specifier |
| Line length | ✅ | Within 100 chars |

---

## Technical Correctness Analysis

### The Problem
In Linux containers with separate PID namespaces, each container's PID 
numbering starts from 1. If multiple secondary DPDK processes in different 
containers share the same hugetlbfs mount, the fbarray names (which include 
PID) can collide.

### Issues with the TSC Solution

1. **TSC is not reliably unique**: Two containers starting at nearly the same 
instant on the same CPU could get very close or identical TSC values. The TSC 
counter increments at GHz frequencies, but container startup can be 
synchronized at millisecond granularity.

2. **TSC synchronization issues**: On multi-socket systems, TSC may not be 
synchronized across sockets/CPUs. While modern CPUs have `constant_tsc` and 
`nonstop_tsc`, this isn't universal.

3. **Overhead**: `rte_get_tsc_cycles()` is very cheap, but it adds a dependency 
on the cycles subsystem being initialized.

4. **Name length**: Adding a 16-character hex value to an already 
length-limited name (RTE_FBARRAY_NAME_LEN) could cause truncation issues.

---

## Simpler Alternative Solutions

### Option 1: Use `rte_rand()` (Recommended)

```c
#include <rte_random.h>

snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%x",
    primary_msl->memseg_arr.name, rte_rand());
```

**Advantages**:
- Cryptographically random, much lower collision probability
- Already seeded during EAL init
- No timing dependencies
- Simpler code (no PID needed at all)

### Option 2: Use the runtime directory uniqueness

The **proper solution** for containers is to use `--file-prefix` which already 
exists. Each container should specify a unique file prefix, making this problem 
disappear entirely. This is a documentation/deployment issue, not a code issue.

### Option 3: Combine with monotonic clock nanoseconds

```c
struct timespec ts;
clock_gettime(CLOCK_MONOTONIC, &ts);
snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%lx",
    primary_msl->memseg_arr.name, 
    (unsigned long)(ts.tv_sec * 1000000000UL + ts.tv_nsec));
```

This was v1 of the patch and is actually fine - CLOCK_MONOTONIC with nanosecond 
resolution within a container's lifetime is effectively unique.

---

## Verdict

**The patch is technically functional but the solution is suboptimal.** 

The TSC approach works in practice but has theoretical collision risks. A 
cleaner approach would be:

1. **Best**: Document that containers sharing hugetlbfs should use 
`--file-prefix`
2. **Code fix if needed**: Use `rte_rand()` instead of TSC - it's simpler and 
provides stronger uniqueness guarantees

Would you like me to draft an alternative patch using `rte_rand()`?

Re: [PATCH v3] eal/linux: fix fbarray name with multiple secondary processes

Reply via email to