On 27/01/26 04:02PM, [email protected] wrote:
Neeraj Kumar wrote:
In 84ec985944ef3, devm_cxl_add_nvdimm() sequence was changed and called
before devm_cxl_add_endpoint(). It's because cxl pmem region auto-assembly
used to get called at last in cxl_endpoint_port_probe(), which requires
cxl_nvd presence.
What?
Hi Dan,
Auto-region assembly was added by a32320b71f085 [1] and during that time
devm_cxl_add_nvdimm() was called after devm_cxl_add_endpoint().
Later Li Ming found issue (kernel panic) in pmem region auto-assembly
and fixed it using 84ec985944ef3 [2]. During this fix devm_cxl_add_nvdimm()
sequence was changed and called before devm_cxl_add_endpoint()
In this patch-set I have tried to bring back the original sequence in order
to create pmem region (single interleave) based of parsed information from
LSA during nvdimm_probe()
[1]
https://lore.kernel.org/r/167601999958.1924368.9366954455835735048.st...@dwillia2-xfh.jf.intel.com
[2] https://lore.kernel.org/all/[email protected]
For cxl region persistency, region creation happens during nvdimm_probe
which need the completion of endpoint probe.
In order to accommodate both cxl pmem region auto-assembly and cxl region
persistency, refactored following
1. Re-Sequence devm_cxl_add_nvdimm() after devm_cxl_add_endpoint(). This
will be called only after successful completion of endpoint probe.
2. Create cxl_region_discovery() which performs pmem region
auto-assembly and remove cxl pmem region auto-assembly from
cxl_endpoint_port_probe()
3. Register cxl_region_discovery() with devm_cxl_add_memdev() which gets
called during cxl_pci_probe() in context of cxl_mem_probe()
4. As cxlmd->attach->probe() calls registered cxl_region_discovery(), so
move devm_cxl_add_nvdimm() before cxlmd->attach->probe(). It guarantees
both the completion of endpoint probe and cxl_nvd presence before
calling cxlmd->attach->probe().
This does not make sense. The whole point of having
devm_cxl_add_nvdimm() before devm_cxl_add_endpoint() is so that the
typical region discovery path can consider pre-existing decoder settings
*or* nvdimm labels in its assembly decisions.
I would be surprised if this passes existing region assembly and
ordering tests.
Actually using following change (as Dave has also checked) with the
current patch-set is making existing region assembly ordering test pass.
But still some other testcase in cxl_test cxl-topology.sh is failing.
---
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index cb87e8c0e63c..03af15edd988 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -1686,6 +1686,7 @@ static void cxl_mock_test_feat_init(struct
cxl_mockmem_data *mdata)
static int cxl_mock_mem_probe(struct platform_device *pdev)
{
+ struct cxl_memdev_attach memdev_attach = { 0 };
struct device *dev = &pdev->dev;
struct cxl_memdev *cxlmd;
struct cxl_memdev_state *mds;
@@ -1767,7 +1768,8 @@ static int cxl_mock_mem_probe(struct platform_device
*pdev)
cxl_mock_add_event_logs(&mdata->mes);
- cxlmd = devm_cxl_add_memdev(cxlds, NULL);
+ memdev_attach.probe = cxl_region_discovery;
+ cxlmd = devm_cxl_add_memdev(cxlds, &memdev_attach);
if (IS_ERR(cxlmd))
return PTR_ERR(cxlmd);
---
This reads like "do not understand current ordering, change it for thin
reasons".
Yes I got your concern that we should reuse the existing infra of region auto
assembly instead of doing this refactoring and creating new infra for region
creation based on information parsed from LSA.
Even the same is discussed in last CXL Collab Sync [1].
I will fix this refactoring along with multi interleave support using existing
auto region assembly infra in next series.
[1]
https://pmem.io/ndctl/collab/#:~:text=LSA%202.1%20support,with%20first%20merge%3F
Regards,
Neeraj