Public bug reported:

[Description]
  This patch stack backports the CXL enablement needed by DGX CXL
  platforms to the linux-nvidia-7.0 kernel. It combines the CXL
  dependency/Type-2 stack with the CXL state save/restore and
  reset stack.
  It includes:

  1. CXL Type-2 and accelerator support
     This brings in the CXL Type-2 foundation and accelerator CXL
     plumbing required for accelerator-attached CXL memory devices:
     - CXL Type-2 support in cxl_dev_state initialization
     - Exported CXL internals needed by external Type-2 drivers
     - Accelerator CXL device creation and CXL register mapping
     - Type-2 memdev creation and attach-region handling
     - Avoiding DAX creation for accelerator memdevs

  2. DAX/HMEM and Soft Reserved coordination
     This carries the DAX/HMEM coordination needed so Soft Reserved
     memory ownership is resolved correctly with CXL regions:
     - Deferred dax_cxl binding until dax_hmem ownership resolution
     - Soft Reserved containment checks against committed CXL regions
     - Reintroduction of Soft Reserved ranges into the iomem tree
     - DEV_DAX_CXL gating and cxl_acpi/cxl_pci module request
       ordering

  3. CXL configuration and platform dependencies
     The stack includes CXL config annotations and related platform
     dependencies:
     - CXL Type-2 and RAS config annotations
     - CXL DAX and KMEM config enablement
     - PCI_CXL annotation for CXL state save/restore
     - ATS enablement dependencies needed by pre-CXL and
       CXL.cache-capable devices

  4. PCI CXL state save/restore
     This backports Srirangan Madhavan's CXL state save/restore
     series:
     - CXL DVSEC control, lock, and range register definitions
     - Public CXL HDM decoder/register-map definitions
     - PCI virtual extended capability save buffer support
     - CXL DVSEC state save/restore across resets
     - HDM decoder state save/restore
     - PCI CXL save/restore wiring via drivers/pci/cxl.c

  5. CXL reset v5 support
     This also backports the CXL reset v5 series:
     - Revert of the older single-commit CXL reset implementation
     - CXL DVSEC reset and capability register definitions
     - Export of pci_dev_save_and_disable() and pci_dev_restore()
     - CXL memory offlining and cache flush helpers
     - Multi-function sibling coordination for CXL reset
     - Full CXL reset flow orchestration
     - cxl_reset sysfs interface for PCI devices
     - ABI documentation for the cxl_reset sysfs attribute

[Justification]
  This backport is required for DGX CXL enablement on the
  linux-nvidia-7.0 kernel. The combined stack enables CXL Type-2
  accelerator memory support, correct Soft Reserved ownership
  handling, CXL PCI state preservation, and controlled CXL device
  reset flows.
  Without this stack:
  - Type-2 accelerator CXL memory devices cannot be represented
    correctly
  - Accelerator CXL device plumbing is missing
  - Soft Reserved memory may be claimed by the wrong DAX path
  - PCI reset paths can lose CXL DVSEC/HDM decoder state
  - The newer CXL reset flow and cxl_reset sysfs interface are
    unavailable

  Source Patch Breakdown
  1. CXL dependency and Type-2 backport
     Includes Type-2 CXL, accelerator CXL plumbing, DAX/HMEM Soft
     Reserved coordination, CXL interleaving support, RAS/config
     annotations, and platform dependencies.
  2. CXL state save/restore and reset backport
     Includes Srirangan Madhavan's CXL state save/restore series and
     CXL reset v5 series, plus the cxl_reset sysfs ABI documentation.

  Branch / Review Context
  Current branch:
    bug-DGX-16137/cxl-backport-26.04-bos-nvpr
  Base branch:
    bug-DGX-16136/cxl-backport-26.04-bos
  The final branch range contains only the CXL save/restore and
  reset commits after the CXL dependency/Type-2 base. Duplicate
  DAX commits mistakenly left during rebase were removed.

[Testing]
  Build Validation:
  - Remote arm64 nvidia-bos whole-kernel build passed for the
    final CXL save/restore and reset stack.
  - Build command covered Image, modules, and dtbs.
  - Produced vmlinux and arch/arm64/boot/Image.

  Reset Validation:
  - CXL reset validation passed on DGX CXL devices.
  - CXL control/range state was preserved across reset.
  - No fatal CXL/PCI/AER/DPC dmesg messages were observed.
  - ResetComplete transitioned as expected and ResetError remained
    clear.
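  The dmesg check above could be automated along the following lines.
  This is only a sketch: the report does not say which strings were
  searched for, so the two patterns and the helper name
  find_fatal_lines() are assumptions, and the sample log lines are
  made up for illustration.

```python
import re

# Assumed patterns; the report does not list the exact strings that were checked.
SUBSYSTEMS = re.compile(r"cxl|pci|\baer\b|\bdpc\b", re.IGNORECASE)
SEVERITY = re.compile(r"\bfatal\b|uncorrect\w*", re.IGNORECASE)


def find_fatal_lines(log_text):
    """Return log lines that name a watched subsystem and a fatal-looking keyword."""
    return [
        line
        for line in log_text.splitlines()
        if SUBSYSTEMS.search(line) and SEVERITY.search(line)
    ]


# Made-up sample log, not output from the tested systems.
sample = """\
[  12.345678] cxl_pci 0000:01:00.0: reset complete
[  12.400000] pcieport 0000:00:01.0: AER: Uncorrectable (Fatal) error message received
[  12.500000] usb 1-1: new high-speed USB device
"""
hits = find_fatal_lines(sample)
```

  In practice the text captured from dmesg after the reset would be
  passed in, and an empty result would correspond to the "no fatal
  messages" outcome reported here.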

  Static Validation:
  - Branch audit found no duplicate commits against the CXL
    dependency/Type-2 base after cleanup.
  - The review finding about reset_done() dereferencing an ERR_PTR
    endpoint was fixed in the reset orchestration commit.
  - Focused static guard/order check passed.
  - checkpatch on the touched CXL diff passed.
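  A duplicate-commit audit of this kind can be approximated by
  comparing commit subjects between the base and the branch. The
  sketch below is a simplification, not the procedure used for this
  report (which is not described): duplicate_subjects() is a
  hypothetical helper and the subjects are invented examples.

```python
def duplicate_subjects(base_subjects, branch_subjects):
    """Return branch subjects that already appear in the base, in branch order."""
    seen = {s.strip().lower() for s in base_subjects}
    return [s for s in branch_subjects if s.strip().lower() in seen]


# Invented subjects for illustration only.
base = [
    "cxl: add Type-2 support",
    "dax/hmem: defer dax_cxl binding",
]
branch = [
    "PCI: add CXL DVSEC definitions",
    "dax/hmem: defer dax_cxl binding",
    "PCI/CXL: add cxl_reset sysfs attribute",
]
dupes = duplicate_subjects(base, branch)
```

  The subject lists could be collected with git log --format=%s over
  each range. Comparing subjects misses commits whose titles were
  edited, so a patch-id based comparison (for example git cherry) is
  the stricter check.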

  Config Verification:
  The stack includes config annotations for:
  - CXL_BUS
  - CXL_PCI
  - CXL_MEM
  - CXL_REGION
  - CXL_RAS
  - CXL DAX/KMEM support
  - PCI_CXL
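  The annotated options can be checked against a built kernel config
  with a short script. This is a minimal sketch under assumptions:
  missing_options() is a hypothetical helper, and DEV_DAX_CXL and
  DEV_DAX_KMEM are assumed to be the symbols behind "CXL DAX/KMEM
  support"; the sample config text is illustrative, not taken from
  the tested build.

```python
# Option names from the list above; the two DEV_DAX_* symbols are assumptions.
REQUIRED = [
    "CXL_BUS",
    "CXL_PCI",
    "CXL_MEM",
    "CXL_REGION",
    "CXL_RAS",
    "DEV_DAX_CXL",
    "DEV_DAX_KMEM",
    "PCI_CXL",
]


def missing_options(config_text, required=REQUIRED):
    """Return required options that are not set to y or m in the config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line[len("CONFIG_"):].split("=", 1)
            if value in ("y", "m"):
                enabled.add(name)
    return [opt for opt in required if opt not in enabled]


# Illustrative fragment, not the real config of this kernel.
sample_config = """\
CONFIG_CXL_BUS=y
CONFIG_CXL_PCI=m
# CONFIG_CXL_MEM is not set
CONFIG_CXL_REGION=y
"""
missing = missing_options(sample_config)
```

  Passing the contents of the generated .config and getting an empty
  list back would indicate that every listed option is built in or
  modular.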

[Notes]
  This series depends on the CXL dependency/Type-2 backport and
  layers CXL state save/restore plus reset support on top.
  The older single-commit CXL reset implementation is reverted
  before applying the newer reset v5 flow to avoid duplicated
  reset helpers and stale DVSEC definitions.

** Affects: linux-nvidia-7.0 (Ubuntu)
     Importance: Undecided
         Status: New

** Also affects: linux-nvidia-7.0 (Ubuntu)
   Importance: Undecided
       Status: New

** No longer affects: linux-nvidia-6.17 (Ubuntu)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2153819

Title:
  CXL: Backport Type-2, state save/restore, and reset support

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-7.0/+bug/2153819/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
