On Wed, Dec 06, 2023 at 08:23:10PM -0800, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <[email protected]>
> >
> > The --poison option to 'cxl list' retrieves poison lists from
> > memory devices supporting the capability and displays the
> > returned poison records in the cxl list json. This option can
> > apply to memdevs or regions.
> >
> > Example usage in the Documentation/cxl/cxl-list.txt update.
> >
> > Signed-off-by: Alison Schofield <[email protected]>
> > ---
> > Documentation/cxl/cxl-list.txt | 58 ++++++++++++++++++++++++++++++++++
> > cxl/filter.h | 3 ++
> > cxl/list.c | 2 ++
> > 3 files changed, 63 insertions(+)
> >
> > diff --git a/Documentation/cxl/cxl-list.txt b/Documentation/cxl/cxl-list.txt
> > index 838de4086678..ee2f1b2d9fae 100644
> > --- a/Documentation/cxl/cxl-list.txt
> > +++ b/Documentation/cxl/cxl-list.txt
> > @@ -415,6 +415,64 @@ OPTIONS
> > --region::
> > Specify CXL region device name(s), or device id(s), to filter the
> > listing.
> >
> > +-L::
> > +--poison::
> > + Include poison information. The poison list is retrieved from the
> > + device(s) and poison records are added to the listing. Apply this
> > + option to memdevs and regions where devices support the poison
> > + list capability.
>
> While CXL calls it "poison" I am not convinced that's the term that end
> users can universally use for this. This is why "ndctl list" uses -M,
> but yeah, -M and -P are already taken. Even -E is taken for "errors".
>
> > +
> > +----
> > +# cxl list -m mem11 --poison
> > +[
> > + {
> > + "memdev":"mem11",
> > + "pmem_size":268435456,
> > + "ram_size":0,
> > + "serial":0,
> > + "host":"0000:37:00.0",
> > + "poison":{
> > + "nr_records":1,
> > + "records":[
>
> One cleanup I want to see before this goes live... drop nr_records and
> just make "poison" an array object directly. The number of records is
> trivially determined by the jq "len" operator.
The poison nr_records count is a convenience for the cmdline user,
not for the user doing their own jq parsing. It also intends to
explicitly show nr_records:0 when no poison is found.
Say NAK again and I'll rm it all.
>
> Also, per above rename "poison" to "media_errors". I believe "poison" is
> an x86'ism where "media_error" is a more generic term.
OK
>
> > + {
> > + "dpa":0,
> > + "dpa_length":64,
> > + "source":"Internal",
> > + }
> > + ]
> > + }
> > + }
> > +]
> > +# cxl list -r region5 --poison
> > +[
> > + {
> > + "region":"region5",
> > + "resource":1035623989248,
> > + "size":2147483648,
> > + "interleave_ways":2,
> > + "interleave_granularity":4096,
> > + "decode_state":"commit",
> > + "poison":{
> > + "nr_records":2,
> > + "records":[
> > + {
> > + "memdev":"mem2",
> > + "dpa":0,
> > + "dpa_length":64,
>
> Does length need to be prefixed with "dpa_"?
>
Nope. Will remove.
> > + "source":"Internal",
>
> I am not sure what the end user can do with "source"? I have tended to
> not emit things if I can't think of a use case for the field to be
> there.
Erwin followed w a comment on this one. Yeah, I think folks want to
know if error was injected at least.
Alison