On Wed, 14 Jan 2026 22:18:07 -0800
Dimon Zhao <[email protected]> wrote:

> The timer interrupt handler periodically queries and updates
> hw_stats via mailbox.
> Since mailbox operations rely on interrupts for packet reception,
> this causes interrupt nesting.
> To resolve this, trigger a task from the interrupt handler and
> start a dedicated thread to execute this task,
> eliminating the nested interrupt scenario.
> 
> Fixes: c9726a719ca1 ("net/nbl: support dropped packets counter")
> Cc: [email protected]
> 
> Signed-off-by: Dimon Zhao <[email protected]>
> ---

Queued to next-net.

You may want to consider the suggestions from AI patch review

### ✅ COMMIT MESSAGE

**Subject Line** (Line 29):
- Length: 53 characters ✓ (limit: 60)
- Format: `net/nbl: fix hardware stats interrupt nesting issue` ✓
- Lowercase with imperative mood ✓
- No trailing period ✓
- Correct component prefix ✓

**Body** (Lines 50-56):
- Lines within 75 character limit ✓
- Doesn't start with "It" ✓
- Clearly explains the problem: interrupt nesting from mailbox operations ✓
- Describes solution: dedicated thread eliminates nested interrupts ✓
- Good technical rationale ✓

**Tags** (Lines 58-61):
- `Fixes:` tag with 12-char SHA and exact subject ✓
- `Cc: [email protected]` present ✓
- `Signed-off-by:` with full name and email ✓
- Correct tag order ✓

### ✅ CODE CHANGES

**Architecture**: Producer-consumer pattern using pipe for interrupt-to-thread 
communication - solid design ✓

**nbl_dev_thread_hw_stats_task()** (Lines 75-93):
- Worker thread that reads from pipe and updates stats
- Uses `read()` with proper return value checking (line 86-87) ✓
- Clean loop termination on pipe closure ✓

**nbl_dev_update_hw_stats_handler()** (Lines 95-109):
- Interrupt handler now writes to pipe instead of direct mailbox call ✓
- **Line 106-107**: `write()` return value captured but marked unused with 
`RTE_SET_USED(nw)`

**nbl_dev_hw_stats_start()** (Lines 213-229):
- Creates pipe with error checking ✓
- Spawns thread with proper error handling ✓
- Cleans up pipe on thread creation failure (lines 124-127) ✓
- Sets fd to -1 on error - good defensive programming ✓

**nbl_dev_hw_stats_stop()** (Lines 336-350):
- Cancels alarm ✓
- Closes pipe to signal thread exit ✓
- **Line 145-146**: `close()` calls without return value checks
- **Line 149**: `rte_thread_join()` without return value check
- Sets fd to -1 after close - good practice ✓

### ⚠️ WARNINGS

**Warning 1**: Line 106-107 - Write return value handling
```c
nw = write(net_dev->fd[1], &notify_byte, 1);
RTE_SET_USED(nw);
```
**Issue**: In interrupt context, the pipe write is intentionally best-effort 
(if write fails, one stats update is skipped but alarm reschedules). However, 
this should have a comment explaining the intentional ignore, or consider at 
least logging on failure if safe in interrupt context.

**Suggested improvement**:
```c
/* Best-effort notification; if write fails, stats update skipped this cycle */
nw = write(net_dev->fd[1], &notify_byte, 1);
if (unlikely(nw != 1))
    NBL_LOG(DEBUG, "hw_stats notification write failed");
```

**Warning 2**: Line 149 - Thread join without error check
```c
rte_thread_join(net_dev->tid, NULL);
```
**Issue**: Return value should be checked per AGENTS.md guidelines on checking 
system calls.

**Suggested improvement**:
```c
ret = rte_thread_join(net_dev->tid, NULL);
if (ret)
    NBL_LOG(ERR, "hw_stats thread join failed, ret %d", ret);
```

**Warning 3**: Lines 145-146 - Close without error check
```c
close(net_dev->fd[0]);
close(net_dev->fd[1]);
```
**Issue**: While errors on `close()` in cleanup paths are typically non-fatal, 
AGENTS.md emphasizes checking return values. At minimum, could check for -1 and 
log.

### ✅ CODE STYLE

- Line length under 100 characters ✓
- Proper indentation ✓
- No trailing whitespace visible ✓
- Variable naming conventions followed ✓
- Comment on line 144 uses correct style ✓

### 📋 FINAL ASSESSMENT

**Errors:** None  
**Warnings:** 3 (return value checking in edge cases)  
**Info:** Excellent architectural solution to interrupt nesting problem

The patch solves a real concurrency issue with a clean producer-consumer 
design. The interrupt-to-thread communication via pipe is the right approach. 
The warnings are minor - mostly about defensive error checking in contexts 
where failures are unlikely or non-fatal.

**Recommendation:** ✅ **APPROVED with minor suggestions**

The patch is fundamentally sound and ready for merge. The suggested 
improvements around return value checking would make it even more robust, but 
the current implementation is acceptable given that:
1. Pipe write failure in interrupt context has limited recovery options
2. Thread join failure during cleanup is rare
3. Close failures are typically informational

**Priority for improvements**: Low - consider for follow-up polish but not 
blocking.

Reply via email to