Re: [PATCH v2 3/4] Documentation: PCI: Amend error recovery doc with DPC/AER specifics

Niklas Schnelle Mon, 15 Sep 2025 12:22:33 -0700

On Mon, 2025-09-15 at 15:50 +0200, Lukas Wunner wrote:
> Amend the documentation on PCI error recovery with specifics about
> Downstream Port Containment and Advanced Error Reporting:
> 
> * Explain that with DPC, devices are inaccessible upon an error (similar
>   to EEH on powerpc) and do not become accessible until the link is
>   re-enabled.
> 
> * Explain that with AER, although devices may already be accessible in the
>   ->error_detected() callback, accesses should be deferred to the
>   ->mmio_enabled() callback for compatibility with EEH on powerpc and with
>   s390.
> 
> Signed-off-by: Lukas Wunner <lu...@wunner.de>
> Reviewed-by: Brian Norris <briannor...@chromium.org>
> ---
>  Documentation/PCI/pci-error-recovery.rst | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst
> index d5c661baa87f..9e1e2f2a13fa 100644
> --- a/Documentation/PCI/pci-error-recovery.rst
> +++ b/Documentation/PCI/pci-error-recovery.rst
> @@ -122,6 +122,10 @@ A PCI bus error is detected by the PCI hardware.  On powerpc, the slot
>  is isolated, in that all I/O is blocked: all reads return 0xffffffff,
>  all writes are ignored.
>  
> +Similarly, on platforms supporting Downstream Port Containment
> +(PCIe r7.0 sec 6.2.11), the link to the sub-hierarchy with the
> +faulting device is disabled. Any device in the sub-hierarchy
> +becomes inaccessible.
>  
>  STEP 1: Notification
>  --------------------
> @@ -204,6 +208,24 @@ link reset was performed by the HW. If the platform can't just re-enable IOs
>  without a slot reset or a link reset, it will not call this callback, and
>  instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
>  
> +.. note::
> +
> +   On platforms supporting Advanced Error Reporting (PCIe r7.0 sec 6.2),
> +   the faulting device may already be accessible in STEP 1 (Notification).
> +   Drivers should nevertheless defer accesses to STEP 2 (MMIO Enabled)
> +   to be compatible with EEH on powerpc and with s390 (where devices are
> +   inaccessible until STEP 2).
> +
> +   On platforms supporting Downstream Port Containment, the link to the
> +   sub-hierarchy with the faulting device is re-enabled in STEP 3 (Link
> +   Reset). Hence devices in the sub-hierarchy are inaccessible until
> +   STEP 4 (Slot Reset).
> +
> +   For errors such as Surprise Down (PCIe r7.0 sec 6.2.7), the device
> +   may not even be accessible in STEP 4 (Slot Reset). Drivers can detect
> +   accessibility by checking whether reads from the device return all 1's
> +   (PCI_POSSIBLE_ERROR()).
> +
>  .. note::
>  
>     The following is proposed; no platform implements this yet:


Thanks for improving this. Makes sense to mention and spell this out
explicitly.
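
For illustration only, the guidance in the new note could look roughly
like the sketch below. It is a user-space mock, not kernel code: the
types, the mock_* functions and MOCK_PCI_POSSIBLE_ERROR() are
hypothetical stand-ins for pci_ers_result_t, the ->error_detected(),
->mmio_enabled() and ->slot_reset() callbacks, and the kernel's
PCI_POSSIBLE_ERROR(). The return values chosen on a failed read are an
assumption, not something the patch prescribes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the all-ones check; the kernel provides PCI_POSSIBLE_ERROR(). */
#define MOCK_PCI_POSSIBLE_ERROR(val) ((val) == UINT32_MAX)

/* Hypothetical stand-ins for the pci_ers_result_t values. */
enum mock_ers_result {
	MOCK_ERS_CAN_RECOVER,
	MOCK_ERS_NEED_RESET,
	MOCK_ERS_DISCONNECT,
	MOCK_ERS_RECOVERED,
};

/* Hypothetical device; 'reachable' models whether the link is up. */
struct mock_dev {
	bool reachable;
	uint32_t status_reg;
	unsigned int mmio_reads;	/* counts register accesses */
};

/* Reads from an unreachable device return all 1's. */
uint32_t mock_read32(struct mock_dev *dev)
{
	dev->mmio_reads++;
	return dev->reachable ? dev->status_reg : UINT32_MAX;
}

/* STEP 1 (Notification): no device access, even if AER would allow it. */
enum mock_ers_result mock_error_detected(struct mock_dev *dev)
{
	assert(dev != NULL);
	return MOCK_ERS_CAN_RECOVER;
}

/* STEP 2 (MMIO Enabled): first point where the device is accessed. */
enum mock_ers_result mock_mmio_enabled(struct mock_dev *dev)
{
	assert(dev != NULL);
	uint32_t status = mock_read32(dev);

	/* Assumption: an all-ones read here leads to a reset request. */
	if (MOCK_PCI_POSSIBLE_ERROR(status))
		return MOCK_ERS_NEED_RESET;
	return MOCK_ERS_RECOVERED;
}

/* STEP 4 (Slot Reset): the device may still be gone (Surprise Down). */
enum mock_ers_result mock_slot_reset(struct mock_dev *dev)
{
	assert(dev != NULL);
	uint32_t status = mock_read32(dev);

	/* Assumption: an all-ones read after reset means giving up. */
	if (MOCK_PCI_POSSIBLE_ERROR(status))
		return MOCK_ERS_DISCONNECT;
	return MOCK_ERS_RECOVERED;
}
```

Note that an all-ones read only indicates a possible error, since a
register may legitimately hold that value.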

Reviewed-by: Niklas Schnelle <schne...@linux.ibm.com>

Re: [PATCH v2 3/4] Documentation: PCI: Amend error recovery doc with DPC/AER specifics
