On Mon, Jun 02, 2025 at 03:56:26PM -0500, Ben Cheatham wrote:
> v2 Changes:
>       - Make the --clear option of 'inject-error' its own command (Alison)
>       - Debugfs is now found using the /proc/mount entry instead of
>       providing the path using a --debugfs option
>       - Man page added for 'clear-error'
>       - Reword commit descriptions for clarity
> 
> This series adds support for injecting CXL protocol (CXL.cache/mem)
> errors[1] into CXL RCH Downstream ports and VH root ports[2] and
> poison into CXL memory devices through the CXL debugfs. Errors are
> injected using a new 'inject-error' command, while errors are reported
> using a new cxl-list "-N"/"--injectable-errors" option. Device poison
> can be cleared using the 'clear-error' command.
> 
> The 'inject-error'/'clear-error' commands and "-N" option of cxl-list all
> require access to the CXL driver's debugfs.
> 
> The documentation for the new cxl-inject-error command shows both usage
> and the possible device/error types, as well as how to retrieve them
> using cxl-list. The documentation for cxl-list has also been updated to
> show the usage of the new injectable errors option.
> 
> [1]: ACPI v6.5 spec, section 18.6.4
> [2]: ACPI v6.5 spec, table 18.31
> 
> --
> 
> Alison, I reached out to Junhyeok about his poison injection series but
> never heard back, so I've just continued with my original plans for a
> v2.
> 
> Quick note: My testing setup is screwed up at the moment, so this
> revision is untested. I'll try to get it fixed for the next revision.

I applied this to v82 (it needs a sync-up in libcxl.sym) and ran the
cxl-poison unit test using your new cxl-cli commands instead of writing to
debugfs directly.[1] Works for me. Just thought I'd share that as proof of
life until I review it completely.

Adding more test cases to cxl-poison.sh makes sense for the device poison.
Wondering about the protocol errors. How do we test those?

[1] diff --git a/test/cxl-poison.sh b/test/cxl-poison.sh
index 6ed890bc666c..41ab670b1094 100644
--- a/test/cxl-poison.sh
+++ b/test/cxl-poison.sh
@@ -68,7 +68,8 @@ inject_poison_sysfs()
        memdev="$1"
        addr="$2"
 
-       echo "$addr" > /sys/kernel/debug/cxl/"$memdev"/inject_poison
+#      echo "$addr" > /sys/kernel/debug/cxl/"$memdev"/inject_poison
+       $CXL inject-error "$memdev" -t poison -a "$addr"
 }
 
 clear_poison_sysfs()
@@ -76,7 +77,8 @@ clear_poison_sysfs()
        memdev="$1"
        addr="$2"
 
-       echo "$addr" > /sys/kernel/debug/cxl/"$memdev"/clear_poison
+#      echo "$addr" > /sys/kernel/debug/cxl/"$memdev"/clear_poison
+       $CXL clear-error "$memdev" -a "$addr"
 }


While applying the patch "Documentation: Add docs for inject/clear-error
commands", I got these whitespace complaints:
234: new blank line at EOF
158: space before tab in indent.
        "offset":"0x1000",
159: space before tab in indent.
        "length":64,
160: space before tab in indent.
        "source":"Injected"
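
In case it helps catch these before the next revision, here is a minimal
sketch of the two checks reported above, written as a POSIX shell function.
The name check_ws is hypothetical and this is not how git implements its
checks; git's own tooling (for example `git diff --check`) remains the
authoritative test.

```shell
# Hypothetical helper: approximate two of git's whitespace complaints
# for a single file. Illustration only, not git's implementation.
check_ws() {
	file="$1"
	tab=$(printf '\t')

	# Flag lines whose leading whitespace contains a space
	# immediately followed by a tab.
	grep -n "^[ $tab]* $tab" "$file" | cut -d: -f1 |
		while read -r n; do
			echo "$n: space before tab in indent."
		done

	# Flag a non-empty file whose last line is empty.
	if [ -s "$file" ] && [ -z "$(tail -n 1 "$file")" ]; then
		echo "new blank line at EOF"
	fi
}
```

Running check_ws on the man page source before sending should list the same
line numbers as above if the space-before-tab indents are still present.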


-- snip

