From: Rob Clark <[email protected]>

[ Upstream commit a814ba2d7b847cff15565bbab781df89e190619c ]

We weren't setting the # of captured debugbus blocks.

Reported-by: Connor Abbott <[email protected]>
Suggested-by: Connor Abbott <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Patchwork: https://patchwork.freedesktop.org/patch/666660/
Signed-off-by: Sasha Levin <[email protected]>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the following reasons:

## Bug Impact Analysis

1. **Clear bug fix**: The code analysis shows that `a7xx_get_debugbus_blocks()` allocates memory for `total_debugbus_blocks` worth of debugbus data and populates all of them, but critically fails to set `a6xx_state->nr_debugbus = total_debugbus_blocks`. This is clearly visible at line 442 of the fixed code.

2. **User-visible impact**: Without setting `nr_debugbus`, the function `a6xx_show_debugbus()` at line 1949 will iterate 0 times (`for (i = 0; i < a6xx_state->nr_debugbus; i++)`), meaning NO debugbus data will be shown in GPU crash dumps for a7xx GPUs. This severely impacts debugging capabilities when GPU hangs or crashes occur.

3. **Regression timeline**: This bug was introduced in commit 64d6255650d4e0 ("drm/msm: More fully implement devcoredump for a7xx") from January 2024, which added the `a7xx_get_debugbus_blocks()` function but forgot to set the counter. The a6xx version of this function correctly sets `nr_debugbus` at lines 372 and 384.

## Stable Tree Criteria Met

1. **Real bug affecting users**: Yes - debugbus data is completely missing from a7xx GPU crash dumps
2. **Small and contained fix**: Yes - single line addition: `a6xx_state->nr_debugbus = total_debugbus_blocks;`
3. **No architectural changes**: The fix simply sets an existing counter variable that was forgotten
4. **Minimal regression risk**: The change only affects the specific code path for a7xx GPUs and simply enables already-allocated and populated data to be displayed
5. **Critical debugging functionality**: GPU crash dumps are essential for debugging graphics driver issues in production

## Technical Details

The bug is in the a7xx-specific path where:
- Memory is allocated for `total_debugbus_blocks` entries (line 426)
- All blocks are populated via `a6xx_get_debugbus_block()` calls (lines 430-439)
- But `nr_debugbus` is never set, leaving it at 0
- This causes `a6xx_show_debugbus()` to skip all debugbus output since it loops from 0 to `nr_debugbus`

The fix correctly sets `nr_debugbus = total_debugbus_blocks` after populating all the data, matching the pattern used in the a6xx equivalent function.

This is a perfect candidate for stable backporting as it fixes a clear functional regression in debugging infrastructure without any risk of destabilizing the system.

 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index a85d3df7a5fac..f46bc906ca2a3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -423,8 +423,9 @@ static void a7xx_get_debugbus_blocks(struct msm_gpu *gpu,
 				a6xx_state, &a7xx_debugbus_blocks[gbif_debugbus_blocks[i]],
 				&a6xx_state->debugbus[i + debugbus_blocks_count]);
 		}
-	}
 
+		a6xx_state->nr_debugbus = total_debugbus_blocks;
+	}
 }
 
 static void a6xx_get_debugbus(struct msm_gpu *gpu,
-- 
2.50.1
