From: Rob Clark <[email protected]>

[ Upstream commit a814ba2d7b847cff15565bbab781df89e190619c ]

We weren't setting the # of captured debugbus blocks.

Reported-by: Connor Abbott <[email protected]>
Suggested-by: Connor Abbott <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Patchwork: https://patchwork.freedesktop.org/patch/666660/
Signed-off-by: Sasha Levin <[email protected]>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Bug Impact Analysis

1. **Clear bug fix**: The code analysis shows that
   `a7xx_get_debugbus_blocks()` allocates memory for
   `total_debugbus_blocks` worth of debugbus data and populates all of
   them, but critically fails to set `a6xx_state->nr_debugbus =
   total_debugbus_blocks`. This is clearly visible at line 442 of the
   fixed code.

2. **User-visible impact**: Without setting `nr_debugbus`, the function
   `a6xx_show_debugbus()` at line 1949 will iterate 0 times (`for (i =
   0; i < a6xx_state->nr_debugbus; i++)`), meaning NO debugbus data will
   be shown in GPU crash dumps for a7xx GPUs. This severely impacts
   debugging capabilities when GPU hangs or crashes occur.

3. **Regression timeline**: This bug was introduced in commit
   64d6255650d4e0 ("drm/msm: More fully implement devcoredump for a7xx")
   from January 2024, which added the `a7xx_get_debugbus_blocks()`
   function but forgot to set the counter. The a6xx version of this
   function correctly sets `nr_debugbus` at lines 372 and 384.

## Stable Tree Criteria Met

1. **Real bug affecting users**: Yes - debugbus data is completely
   missing from a7xx GPU crash dumps
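2. **Small and contained fix**: Yes - one functional line is added:
   `a6xx_state->nr_debugbus = total_debugbus_blocks;` (the diffstat's
   2 insertions and 1 deletion come from the closing brace moving below
   the new line)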
3. **No architectural changes**: The fix simply sets an existing counter
   variable that was forgotten
4. **Minimal regression risk**: The change only affects the specific
   code path for a7xx GPUs and simply enables already-allocated and
   populated data to be displayed
5. **Critical debugging functionality**: GPU crash dumps are essential
   for debugging graphics driver issues in production

## Technical Details

The bug is in the a7xx-specific path where:
- Memory is allocated for `total_debugbus_blocks` entries (line 426)
- All blocks are populated via `a6xx_get_debugbus_block()` calls (lines
  430-439)
- But `nr_debugbus` is never set, leaving it at 0
- This causes `a6xx_show_debugbus()` to skip all debugbus output since
  it loops from 0 to `nr_debugbus`

The fix correctly sets `nr_debugbus = total_debugbus_blocks` after
populating all the data, matching the pattern used in the a6xx
equivalent function.
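
The allocate/populate/count pattern described above can be sketched in
isolation. This is a minimal model, not the driver code: `demo_state`,
`demo_block`, `demo_capture()` and `demo_show()` are hypothetical
stand-ins for the msm structures and functions, and the `set_count`
flag exists only to contrast the behavior before and after the fix.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the real msm driver types. */
struct demo_block {
	int id;
};

struct demo_state {
	struct demo_block *debugbus;
	int nr_debugbus;	/* how many entries the show path will walk */
};

/*
 * Models the capture path: allocate, populate, then record the count.
 * With set_count == 0 the count is left untouched, as in the buggy code.
 */
static void demo_capture(struct demo_state *state, int total_blocks,
			 int set_count)
{
	state->debugbus = calloc(total_blocks, sizeof(*state->debugbus));
	if (!state->debugbus)
		return;

	for (int i = 0; i < total_blocks; i++)
		state->debugbus[i].id = i;

	if (set_count)
		state->nr_debugbus = total_blocks;
}

/* Models the show path: walks nr_debugbus entries, returns how many it printed. */
static int demo_show(const struct demo_state *state)
{
	int shown = 0;

	for (int i = 0; i < state->nr_debugbus; i++) {
		printf("debugbus block %d\n", state->debugbus[i].id);
		shown++;
	}

	return shown;
}
```

Calling `demo_capture()` on a zero-initialized `demo_state` with
`set_count` of 0 leaves `nr_debugbus` at 0, so `demo_show()` prints
nothing even though every entry was filled in; with `set_count` of 1
all captured entries are printed.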

This is a good candidate for stable backporting: it fixes a clear
functional regression in debugging infrastructure with minimal risk of
destabilizing the system.

 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index a85d3df7a5fac..f46bc906ca2a3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -423,8 +423,9 @@ static void a7xx_get_debugbus_blocks(struct msm_gpu *gpu,
                                a6xx_state, &a7xx_debugbus_blocks[gbif_debugbus_blocks[i]],
                                &a6xx_state->debugbus[i + debugbus_blocks_count]);
                }
-       }
 
+               a6xx_state->nr_debugbus = total_debugbus_blocks;
+       }
 }
 
 static void a6xx_get_debugbus(struct msm_gpu *gpu,
-- 
2.50.1
