On 12/2/2025 7:50 PM, [email protected] wrote:
Smita Koralahalli wrote:
Introduce cxl_regions_fully_map() to check whether CXL regions form a
single contiguous, non-overlapping cover of a given Soft Reserved range.

Use this helper to decide whether Soft Reserved memory overlapping CXL
regions should be owned by CXL or registered by HMEM.

If the span is fully covered by CXL regions, treat the Soft Reserved
range as owned by CXL and have HMEM skip registration. Otherwise, let
HMEM claim the range and register the corresponding devdax for it.

This all feels a bit too custom when helpers like resource_contains()
exist.

Also remember that the default list of soft-reserved ranges that dax
grabs is filtered by the ACPI HMAT. So while there is a chance that one
EFI memory map entry spans multiple CXL regions, there is a lower chance
that a single ACPI HMAT range spans multiple CXL regions.

I think it is fair for Linux to be simple and require that an algorithm
of:

cxl_contains_soft_reserve()
     for_each_cxl_intersecting_hmem_resource()
         found = false
         for_each_region()
            if (resource_contains(cxl_region_resource, hmem_resource))
                found = true
         if (!found)
             return false
     return true

...should be good enough; otherwise fall back to pure hmem operation
and do not worry about the corner cases.
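
For concreteness, a minimal C sketch of that algorithm could look like
the below; for_each_intersecting_hmem_resource() and
for_each_cxl_region() are illustrative placeholders for whatever walk
the real patch settles on, not existing kernel interfaces:

static bool cxl_contains_soft_reserve(const struct resource *soft)
{
        const struct resource *hmem;
        struct cxl_region *cxlr;

        /* Every hmem resource intersecting @soft must sit inside a region */
        for_each_intersecting_hmem_resource(hmem, soft) {
                bool found = false;

                for_each_cxl_region(cxlr) {
                        if (resource_contains(cxlr->params.res, hmem)) {
                                found = true;
                                break;
                        }
                }
                if (!found)
                        return false;
        }
        return true;
}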

If Linux really needs to understand that ACPI HMAT ranges may span
multiple CXL regions then I would want to understand more what is
driving that configuration.

I was trying to handle a case like Tomasz's setup in [2], where a single Soft Reserved span and CFMWS cover two CXL regions:

kernel: [ 0.000000][ T0] BIOS-e820: [mem 0x0000000a90000000-0x0000000c8fffffff] soft reserved

a90000000-c8fffffff : CXL Window 0
  a90000000-b8fffffff : region1
  b90000000-c8fffffff : region0

…so I ended up with the more generic cxl_regions_fully_map() walker. I missed the detail that the HMAT-filtered Soft Reserved ranges we actually act on are much less likely to span multiple regions, and on AMD platforms we effectively have a 1:1 mapping. I'm fine with simplifying this per your suggestion.
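
(For reference, in that layout region1 ends at 0xb8fffffff and region0
starts at 0xb90000000, so the two regions tile the Soft Reserved span
with no gap or overlap, which is exactly the condition
cxl_regions_fully_map() was meant to detect.)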


Btw, I do not see a:

     guard(rwsem_read)(&cxl_rwsem.region)

...anywhere in the proposed patch. That needs to be held to be sure the
region's resource settings are not changed out from underneath you. This
should probably also check that the region is in the commit state,
because it may still be racing with regions under creation post
wait_for_device_probe().

Sure, I will add this.
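
A rough sketch of how that could fold into the walk, again assuming a
hypothetical for_each_cxl_region() iterator and a hypothetical
cxl_covers() helper name (the commit-state test via params.state is
just one way to express the check):

static bool cxl_covers(const struct resource *res)
{
        struct cxl_region *cxlr;

        /* Keep region resource settings stable while inspecting them */
        guard(rwsem_read)(&cxl_rwsem.region);

        for_each_cxl_region(cxlr) {
                struct cxl_region_params *p = &cxlr->params;

                /* Skip regions still racing through assembly */
                if (p->state != CXL_CONFIG_COMMIT || !p->res)
                        continue;

                if (resource_contains(p->res, res))
                        return true;
        }
        return false;
}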


  void cxl_endpoint_parse_cdat(struct cxl_port *port);
diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index f70a0688bd11..db4c46337ac3 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -3,6 +3,8 @@
  #include <linux/memregion.h>
  #include <linux/module.h>
  #include <linux/dax.h>
+
+#include "../../cxl/cxl.h"
  #include "../bus.h"
static bool region_idle;
@@ -150,7 +152,17 @@ static int hmem_register_device(struct device *host, int target_nid,
  static int handle_deferred_cxl(struct device *host, int target_nid,
                               const struct resource *res)
  {
-       /* TODO: Handle region assembly failures */
+       if (region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
+                             IORES_DESC_CXL) != REGION_DISJOINT) {
+
+               if (cxl_regions_fully_map(res->start, res->end))
+                       dax_cxl_mode = DAX_CXL_MODE_DROP;
+               else
+                       dax_cxl_mode = DAX_CXL_MODE_REGISTER;
+
+               hmem_register_device(host, target_nid, res);
+       }
+
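
(Reading the hunk: region_intersects() only answers whether any CXL
region touches the Soft Reserved range; the fully-mapped check then
sets dax_cxl_mode to DAX_CXL_MODE_DROP when CXL owns the whole span and
DAX_CXL_MODE_REGISTER otherwise, before handing the range to
hmem_register_device().)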

I think there is enough content to just create the new
cxl_contains_soft_reserve() ABI, and then hook up handle_deferred_cxl in
a follow-on patch.

Okay.

Thanks
Smita

