On Mon, 19 May 2025, Bjorn Helgaas wrote:

> From: Jon Pan-Doh <pan...@google.com>
> 
> Allow userspace to read/write log ratelimits per device (including
> enable/disable). Create aer/ sysfs directory to store them and any
> future aer configs.
> 
> Update AER sysfs ABI filename to reflect the broader scope of AER sysfs
> attributes (e.g. stats and ratelimits).
> 
>   Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats ->
>     sysfs-bus-pci-devices-aer
> 
> Tested using aer-inject[1]. Configured correctable log ratelimit to 5.
> Sent 6 AER errors. Observed 5 errors logged while AER stats
> (cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) shows 6.
> 
> Disabled ratelimiting and sent 6 more AER errors. Observed all 6 errors
> logged and accounted in AER stats (12 total errors).
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git
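
For readers following the test description above, here is a minimal
userspace sketch of the burst/interval arithmetic it relies on. It is a
simplified model under stated assumptions, not the kernel's ratelimit
code: struct model_ratelimit, model_ratelimit_allow() and model_inject()
are hypothetical names, time is passed in as plain seconds, and every
error is counted in the stats whether or not it is logged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a log ratelimit (not kernel code). */
struct model_ratelimit {
	int interval;	/* window length in seconds; 0 means "never limit" */
	int burst;	/* messages allowed per window */
	int begin;	/* start of the current window, in seconds */
	int printed;	/* messages already allowed in this window */
};

/* Return true if a message arriving at time 'now' may be logged. */
static bool model_ratelimit_allow(struct model_ratelimit *rs, int now)
{
	assert(rs != NULL);

	if (rs->interval == 0)
		return true;		/* ratelimiting disabled */

	if (now - rs->begin >= rs->interval) {
		rs->begin = now;	/* window expired: start a new one */
		rs->printed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	return false;			/* over the burst: suppress the log */
}

struct model_result {
	int logged;	/* messages that passed the ratelimit */
	int counted;	/* errors accounted in the stats */
};

/* Inject nr_errors errors, all arriving at time 'now'. */
static struct model_result model_inject(struct model_ratelimit *rs,
					int nr_errors, int now)
{
	struct model_result res = { 0, 0 };
	int i;

	for (i = 0; i < nr_errors; i++) {
		res.counted++;		/* stats are never ratelimited */
		if (model_ratelimit_allow(rs, now))
			res.logged++;
	}

	return res;
}
```

With interval = 5 and burst = 5, injecting 6 errors in one window gives 5
logged and 6 counted. Setting interval to 0 and injecting 6 more gives 6
logged, for 12 counted in total, which matches the numbers in the commit
message.
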
> 
> Signed-off-by: Karolina Stolarek <karolina.stola...@oracle.com>
> Signed-off-by: Jon Pan-Doh <pan...@google.com>
> Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
> Acked-by: Paul E. McKenney <paul...@kernel.org>
> ---
>  ...es-aer_stats => sysfs-bus-pci-devices-aer} | 34 +++++++
>  Documentation/PCI/pcieaer-howto.rst           |  5 +-
>  drivers/pci/pci-sysfs.c                       |  1 +
>  drivers/pci/pci.h                             |  1 +
>  drivers/pci/pcie/aer.c                        | 99 +++++++++++++++++++
>  5 files changed, 139 insertions(+), 1 deletion(-)
>  rename Documentation/ABI/testing/{sysfs-bus-pci-devices-aer_stats => sysfs-bus-pci-devices-aer} (77%)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer
> similarity index 77%
> rename from Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> rename to Documentation/ABI/testing/sysfs-bus-pci-devices-aer
> index d1f67bb81d5d..771204197b71 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> +++ b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer
> @@ -117,3 +117,37 @@ Date:            July 2018
>  KernelVersion:       4.19.0
>  Contact:     linux-...@vger.kernel.org, raja...@google.com
>  Description: Total number of ERR_NONFATAL messages reported to rootport.
> +
> +PCIe AER ratelimits
> +-------------------
> +
> +These attributes show up under all the devices that are AER capable.
> +They represent configurable ratelimits of logs per error type.
> +
> +See Documentation/PCI/pcieaer-howto.rst for more info on ratelimits.
> +
> +What:                /sys/bus/pci/devices/<dev>/aer/ratelimit_log_enable
> +Date:                March 2025
> +KernelVersion:       6.15.0

This ship has sailed, so KernelVersion needs updating (here and in the
two entries below).

> +Contact:     linux-...@vger.kernel.org, pan...@google.com
> +Description: Writing 1/0 enables/disables AER log ratelimiting. Reading
> +             gets whether or not AER is currently enabled.

Is it AER or AER log ratelimiting that is reported as enabled? Going by
ratelimit_log_enable_show() below, it looks like the latter.

> +             Enabled by
> +             default.
> +
> +What:                /sys/bus/pci/devices/<dev>/aer/ratelimit_burst_cor_log
> +Date:                March 2025
> +KernelVersion:       6.15.0
> +Contact:     linux-...@vger.kernel.org, pan...@google.com
> +Description: Ratelimit burst for correctable error logs. Writing a value
> +             changes the number of errors (burst) allowed per interval
> +             (5 second window) before ratelimiting. Reading gets the
> +             current ratelimit burst.
> +
> +What:                /sys/bus/pci/devices/<dev>/aer/ratelimit_burst_uncor_log
> +Date:                March 2025
> +KernelVersion:       6.15.0
> +Contact:     linux-...@vger.kernel.org, pan...@google.com
> +Description: Ratelimit burst for uncorrectable error logs. Writing a
> +             value changes the number of errors (burst) allowed per
> +             interval (5 second window) before ratelimiting. Reading
> +             gets the current ratelimit burst.
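
As an illustration of how userspace might use the attributes documented
above, here is a minimal sketch of reading and writing an integer
attribute file. attr_write_int() and attr_read_int() are hypothetical
helper names; the caller supplies the full path (for example, the
ratelimit_burst_cor_log file of a given device), and per the store
handlers in this patch, writes need CAP_SYS_ADMIN.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write 'value' as a decimal string to the attribute at 'path'.
 * Returns 0 on success, -1 on failure. The fclose() result is checked
 * because stdio buffers the data, so a write rejected by the kernel
 * may only show up when the buffer is flushed.
 */
static int attr_write_int(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Read a decimal integer from the attribute at 'path' into '*value'.
 * Returns 0 on success, -1 on failure.
 */
static int attr_read_int(const char *path, int *value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	assert(value != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = fscanf(f, "%d", value) == 1;
	fclose(f);

	return ok ? 0 : -1;
}
```

The same helpers would work for ratelimit_log_enable, since it also
reads back as 0 or 1.
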
> diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
> index 896d2a232a90..043cdb3194be 100644
> --- a/Documentation/PCI/pcieaer-howto.rst
> +++ b/Documentation/PCI/pcieaer-howto.rst
> @@ -96,12 +96,15 @@ type (correctable vs. uncorrectable).
>  AER uses the default ratelimit of DEFAULT_RATELIMIT_BURST (10 events) over
>  DEFAULT_RATELIMIT_INTERVAL (5 seconds).
>  
> +Ratelimits are exposed in the form of sysfs attributes and configurable.
> +See Documentation/ABI/testing/sysfs-bus-pci-devices-aer.
> +
>  AER Statistics / Counters
>  -------------------------
>  
>  When PCIe AER errors are captured, the counters / statistics are also exposed
>  in the form of sysfs attributes which are documented at
> -Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> +Documentation/ABI/testing/sysfs-bus-pci-devices-aer.
>  
>  Developer Guide
>  ===============
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index c6cda56ca52c..278de99b00ce 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1805,6 +1805,7 @@ const struct attribute_group *pci_dev_attr_groups[] = {
>       &pcie_dev_attr_group,
>  #ifdef CONFIG_PCIEAER
>       &aer_stats_attr_group,
> +     &aer_attr_group,
>  #endif
>  #ifdef CONFIG_PCIEASPM
>       &aspm_ctrl_attr_group,
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 65c466279ade..a3261e842d6d 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -963,6 +963,7 @@ void pci_no_aer(void);
>  void pci_aer_init(struct pci_dev *dev);
>  void pci_aer_exit(struct pci_dev *dev);
>  extern const struct attribute_group aer_stats_attr_group;
> +extern const struct attribute_group aer_attr_group;
>  void pci_aer_clear_fatal_status(struct pci_dev *dev);
>  int pci_aer_clear_status(struct pci_dev *dev);
>  int pci_aer_raw_clear_status(struct pci_dev *dev);
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index c335e0bb9f51..42df5cb963b3 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -627,6 +627,105 @@ const struct attribute_group aer_stats_attr_group = {
>       .is_visible = aer_stats_attrs_are_visible,
>  };
>  
> +/*
> + * Ratelimit enable toggle
> + * 0: disabled with ratelimit.interval = 0
> + * 1: enabled with ratelimit.interval = nonzero
> + */
> +static ssize_t ratelimit_log_enable_show(struct device *dev,
> +                                      struct device_attribute *attr,
> +                                      char *buf)
> +{
> +     struct pci_dev *pdev = to_pci_dev(dev);
> +     bool enabled = pdev->aer_report->cor_log_ratelimit.interval != 0;
> +
> +     return sysfs_emit(buf, "%d\n", enabled);
> +}
> +
> +static ssize_t ratelimit_log_enable_store(struct device *dev,
> +                                       struct device_attribute *attr,
> +                                       const char *buf, size_t count)
> +{
> +     struct pci_dev *pdev = to_pci_dev(dev);
> +     bool enable;
> +     int interval;
> +
> +     if (!capable(CAP_SYS_ADMIN))
> +             return -EPERM;
> +
> +     if (kstrtobool(buf, &enable) < 0)
> +             return -EINVAL;
> +
> +     if (enable)
> +             interval = DEFAULT_RATELIMIT_INTERVAL;
> +     else
> +             interval = 0;
> +
> +     pdev->aer_report->cor_log_ratelimit.interval = interval;
> +     pdev->aer_report->uncor_log_ratelimit.interval = interval;
> +
> +     return count;
> +}
> +static DEVICE_ATTR_RW(ratelimit_log_enable);
> +
> +#define aer_ratelimit_burst_attr(name, ratelimit)                    \
> +     static ssize_t                                                  \
> +     name##_show(struct device *dev, struct device_attribute *attr,  \
> +                 char *buf)                                          \
> +{                                                                    \
> +     struct pci_dev *pdev = to_pci_dev(dev);                         \
> +                                                                     \
> +     return sysfs_emit(buf, "%d\n",                                  \
> +                       pdev->aer_report->ratelimit.burst);           \
> +}                                                                    \
> +                                                                     \
> +     static ssize_t                                                  \
> +     name##_store(struct device *dev, struct device_attribute *attr, \
> +                  const char *buf, size_t count)                     \
> +{                                                                    \
> +     struct pci_dev *pdev = to_pci_dev(dev);                         \
> +     int burst;                                                      \
> +                                                                     \
> +     if (!capable(CAP_SYS_ADMIN))                                    \
> +             return -EPERM;                                          \
> +                                                                     \
> +     if (kstrtoint(buf, 0, &burst) < 0)                              \
> +             return -EINVAL;                                         \
> +                                                                     \
> +     pdev->aer_report->ratelimit.burst = burst;                      \
> +                                                                     \
> +     return count;                                                   \
> +}                                                                    \
> +static DEVICE_ATTR_RW(name)
> +
> +aer_ratelimit_burst_attr(ratelimit_burst_cor_log, cor_log_ratelimit);
> +aer_ratelimit_burst_attr(ratelimit_burst_uncor_log, uncor_log_ratelimit);
> +
> +static struct attribute *aer_attrs[] = {
> +     &dev_attr_ratelimit_log_enable.attr,
> +     &dev_attr_ratelimit_burst_cor_log.attr,
> +     &dev_attr_ratelimit_burst_uncor_log.attr,
> +     NULL
> +};
> +
> +static umode_t aer_attrs_are_visible(struct kobject *kobj,
> +                                  struct attribute *a, int n)
> +{
> +     struct device *dev = kobj_to_dev(kobj);
> +     struct pci_dev *pdev = to_pci_dev(dev);
> +
> +     if (!pdev->aer_report)
> +             return 0;
> +
> +     return a->mode;
> +}
> +
> +const struct attribute_group aer_attr_group = {
> +     .name = "aer",
> +     .attrs = aer_attrs,
> +     .is_visible = aer_attrs_are_visible,
> +};
> +
>  static void pci_dev_aer_stats_incr(struct pci_dev *pdev,
>                                  struct aer_err_info *info)
>  {
> 

-- 
 i.

