On Mon, Mar 18, 2024 at 02:01:38PM -0700, Dan Williams wrote:
> Alison Schofield wrote:
> > On Mon, Mar 18, 2024 at 10:51:13AM -0700, fan wrote:
> > > On Wed, Mar 13, 2024 at 09:05:17PM -0700, alison.schofi...@intel.com 
> > > wrote:
> > > > From: Alison Schofield <alison.schofi...@intel.com>
> > > > 
> > > > CXL devices maintain a list of locations that are poisoned or result
> > > > in poison if the addresses are accessed by the host.
> > > > 
> > > > Per the spec (CXL 3.1 8.2.9.9.4.1), the device returns the Poison
> > > > List as a set of  Media Error Records that include the source of the
> > > > error, the starting device physical address and length.
> > > > 
> > > > Trigger the retrieval of the poison list by writing to the memory
> > > > device sysfs attribute: trigger_poison_list. The CXL driver only
> > > > offers triggering per memdev, so the trigger by region interface
> > > > offered here is a convenience API that triggers a poison list
> > > > retrieval for each memdev contributing to a region.
> > > > 
> > > > int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev);
> > > > int cxl_region_trigger_poison_list(struct cxl_region *region);
> > > > 
> > > > The resulting poison records are logged as kernel trace events
> > > > named 'cxl_poison'.
> > > > 
> > > > Signed-off-by: Alison Schofield <alison.schofi...@intel.com>
> > > > Reviewed-by: Dave Jiang <dave.ji...@intel.com>
> > > > ---
> > > >  cxl/lib/libcxl.c   | 47 ++++++++++++++++++++++++++++++++++++++++++++++
> > > >  cxl/lib/libcxl.sym |  2 ++
> > > >  cxl/libcxl.h       |  2 ++
> > > >  3 files changed, 51 insertions(+)
> > > > 
> > > > diff --git a/cxl/lib/libcxl.c b/cxl/lib/libcxl.c
> > > > index ff27cdf7c44a..73db8f15c704 100644
> > > > --- a/cxl/lib/libcxl.c
> > > > +++ b/cxl/lib/libcxl.c
> > > > @@ -1761,6 +1761,53 @@ CXL_EXPORT int 
> > > > cxl_memdev_disable_invalidate(struct cxl_memdev *memdev)
> > > >         return 0;
> > > >  }
> > > >  
> > > > +CXL_EXPORT int cxl_memdev_trigger_poison_list(struct cxl_memdev 
> > > > *memdev)
> > > > +{
> > > > +       struct cxl_ctx *ctx = cxl_memdev_get_ctx(memdev);
> > > > +       char *path = memdev->dev_buf;
> > > > +       int len = memdev->buf_len, rc;
> > > > +
> > > > +       if (snprintf(path, len, "%s/trigger_poison_list",
> > > > +                    memdev->dev_path) >= len) {
> > > > +               err(ctx, "%s: buffer too small\n",
> > > > +                   cxl_memdev_get_devname(memdev));
> > > > +               return -ENXIO;
> > > > +       }
> > > > +       rc = sysfs_write_attr(ctx, path, "1\n");
> > > > +       if (rc < 0) {
> > > > +               fprintf(stderr,
> > > > +                       "%s: Failed write sysfs attr 
> > > > trigger_poison_list\n",
> > > > +                       cxl_memdev_get_devname(memdev));
> > > 
> > > Should we use err() instead of fprintf here? 
> > 
> > Thanks Fan,
> > 
> > How about this?
> > 
> > - use fprintf if access() fails, ie device doesn't support poison list,
> > - use err() for failure to actually read the poison list on a device with
> >   support
> 
> Why? There is no raw usage of fprintf in any of the libraries (ndctl,
> daxctl, cxl) to date. If someone builds the library without logging then
> it should not chat on stderr at all, and if someone redirects logging to
> syslog then it also should emit messages only there and not stderr.

Why indeed :(

I'll remove the fprintf() and only use err() for both cases: device
doesn't support feature, or failure to read list.


Reply via email to