On Thu,  9 Oct 2025 18:55:36 +0800 Jinchao Wang <[email protected]> wrote:

> This patch series introduces KStackWatch, a lightweight debugging tool to
> detect kernel stack corruption in real time. It installs a hardware breakpoint
> (watchpoint) at a function's specified offset using `kprobe.post_handler` and
> removes it in `fprobe.exit_handler`. This covers the full execution window and
> reports corruption immediately with time, location, and a call stack.
> 
> The motivation comes from scenarios where corruption occurs silently in one
> function but manifests later in another, without a direct call trace linking
> the two. Such bugs are often extremely hard to debug with existing tools.
> These scenarios are demonstrated in test 3–5 (silent corruption test,
> patch 20).
> 
> ...
>
>  20 files changed, 1809 insertions(+), 62 deletions(-)

It's obviously a substantial project.  We need to decide whether to add
this to Linux.

There are some really important [0/N] changelog details which I'm not
immediately seeing:

Am I correct in thinking that it's x86-only?  If so, what's involved in
enabling other architectures?  Is there any such work in progress?

What motivated the work?  Was there some particular class of failures
which you were persistently seeing and wished to fix more efficiently?

Has this code (or something like it) been used in production systems? 
If so, by whom and with what results?

Has it actually found some kernel bugs yet?  If so, details please.

Can this be enabled on production systems?  If so, what is the
measured runtime overhead?
