> On May 16, 2025, at 11:30, Itaru Kitayama <itaru.kitay...@linux.dev> wrote:
> 
> Hi Jonathan,
> 
>> On May 13, 2025, at 20:14, Jonathan Cameron <jonathan.came...@huawei.com> 
>> wrote:
>> 
>> V13:
>> - Make CXL fixed memory windows sysbus devices.
>> IIRC this was requested by Peter in one of the reviews a long time back
>> but at the time the motivation was less strong than it becomes with some
>> WiP patches for hotness monitoring and high performance direct connect
>> where we need a machine type independent way to iterate all the CXL
>> fixed memory windows. This is a convenient place to do it so drag that
>> work forward into this series.
>> 
>> This allows us to drop separate list and necessary machine specific
>> access code in favour of
>> object_child_foreach_recursive(object_get_root(),...)
>> One snag is that the ordering of multiple fixed memory windows in that
>> walk depends on the underlying g_hash_table iterations rather than the
>> order of creation. In the memory map layout and ACPI table creation we
>> need both stable and predictable ordering. Resolve this in a similar
>> fashion to object_class_get_list_sorted() by throwing them in a GSList
>> and sorting that. Only use this when a sorted list is needed.
>> 
>> Dropped RFC as now I'm happy with this code and would like to get it
>> upstream!  Particularly as it is broken even today due to emscripten
>> related changes that stop us using g_slist_sort(). Easy fix though.
>> 
>> Note that we have an issue for CXL emulation in general and TCG which
>> is being discussed in:
>> https://lore.kernel.org/all/20250425183524.00000...@huawei.com/
>> (also affects some other platforms)
>> 
>> Until that is resolved, either rebase this back on 10.0 or just
>> don't let code run out of it (don't use KMEM to expose it as normal
>> memory, use DAX instead).
>> 
>> Previous cover letter.
>> 
>> Back in 2022, this series stalled on the absence of a solution to device
>> tree support for PCI Expander Bridges (PXB) and we ended up only having
>> x86 support upstream. I've been carrying the arm64 support out of tree
>> since then, with occasional nasty surprises (e.g. UNIMP + DT issue seen
>> a few weeks ago) and a fair number of fiddly rebases.
>> gitlab.com/jic23/qemu cxl-<latest date>
>> 
>> A recent discussion with Peter Maydell indicated that there are various
>> other ACPI only features now, so in general he might be more relaxed
>> about DT support being necessary. The upcoming vSMMUv3 support would
>> run into this problem as well.
>> 
>> I presented the background to the PXB issue at Linaro connect 2022. In
>> short the issue is that PXBs steal MMIO space from the main PCI root
>> bridge. The challenge is knowing how much to steal.
>> 
>> On ACPI platforms, we can rely on EDK2 to perform an enumeration and
>> configuration of the PCI topology and QEMU can update the ACPI tables
>> after EDK2 has done this when it can simply read the space used by the
>> root ports. On device tree, there is no entity to figure out that
>> enumeration so we don't know how to size the stolen region.
>> 
>> Three approaches were discussed:
>> 1) Enumerating in QEMU. Horribly complex and the last thing we want is a
>>  3rd enumeration implementation that ends up out of sync with EDK2 and
>>  the kernel (there are frequent issues because of how those existing
>>  implementations differ).
>> 2) Figure out how to enumerate in kernel. I never put a huge amount of work
>>  into this, but it seemed likely to involve a nasty dance with similar
>>  very specific code to that EDK2 is carrying and would be very challenging
>>  to upstream (given the lack of clarity on real use cases for PXBs and
>>  DT).
>> 3) Hack it based on the control we have which is bus numbers.
>>  No one liked this but it worked :)
>> 
>> The other little wrinkle would be the need to define full bindings for CXL
>> on DT + implement a fairly complex kernel stack as equivalent in ACPI
>> involves a static table, CEDT, new runtime queries via _DSM and a description
>> of various components. Doable, but so far there is no interest on physical
>> platforms. Worth noting that for now, the QEMU CXL emulation is all about
>> testing and developing the OS stack, not about virtualization (performance
>> is terrible except in some very contrived situations!)
>> 
>> Back to posting as an RFC because there was some discussion of approach to
>> modelling the devices that may need a bit of redesign.
>> The discussion kind of died out on the back of DT issue and I doubt anyone
>> can remember the details.
>> 
>> https://lore.kernel.org/qemu-devel/20220616141950.23374-1-jonathan.came...@huawei.com/
>> 
>> There is only a very simple test in here, because my intent is not to
>> duplicate what we have on x86, but just to do a smoke test that everything
>> is hooked up.  In general we need much more comprehensive end to end CXL
>> tests but that requires a reasonably stable guest software stack. A few
>> people have expressed interest in working on that, but we aren't there yet.
>> 
>> Note that this series has a very different use case to that in the proposed
>> SBSA-ref support:
>> https://lore.kernel.org/qemu-devel/20250117034343.26356-1-wangyuquan1...@phytium.com.cn/
>> 
>> SBSA-ref is a good choice if you want a relatively simple mostly fixed
>> configuration.  That works well with the limited host system
>> discoverability etc as EDK2 can be built against a known configuration.
>> 
>> My interest with this support in arm/virt is to support host software stack
>> development (we have a wide range of contributors, most of whom are working
>> on emulation + the kernel support). I care about the weird corners. As such
>> I need to be able to bring up variable numbers of host bridges, multiple CXL
>> Fixed Memory Windows with varying characteristics (interleave etc), complex
>> NUMA topologies with weird performance characteristics etc. We can do that
>> on x86 upstream today, or my gitlab tree. Note that we need arm support
>> for some arch specific features in the near future (cache flushing).
>> Doing kernel development with this need for flexibility on SBSA-ref is not
>> currently practical. SBSA-ref CXL support is an excellent thing, just
>> not much use to me for this work.
>> 
>> Jonathan Cameron (5):
>> hw/cxl-host: Add an index field to CXLFixedMemoryWindow
>> hw/cxl: Make the CXL fixed memory windows devices.
>> hw/cxl-host: Allow split of establishing memory address and mmio
>>   setup.
>> hw/arm/virt: Basic CXL enablement on pci_expander_bridge instances
>>   pxb-cxl
>> qtest/cxl: Add aarch64 virt test for CXL
>> 
>> include/hw/arm/virt.h     |   4 +
>> include/hw/cxl/cxl.h      |   4 +
>> include/hw/cxl/cxl_host.h |   6 +-
>> hw/acpi/cxl.c             |  83 +++++++++------
>> hw/arm/virt-acpi-build.c  |  34 ++++++
>> hw/arm/virt.c             |  29 +++++
>> hw/cxl/cxl-host-stubs.c   |   8 +-
>> hw/cxl/cxl-host.c         | 218 ++++++++++++++++++++++++++++++++------
>> hw/i386/pc.c              |  51 ++++-----
>> tests/qtest/cxl-test.c    |  59 ++++++++---
>> tests/qtest/meson.build   |   1 +
>> 11 files changed, 389 insertions(+), 108 deletions(-)
>> 
>> -- 
>> 2.43.0
>> 
> 
> With your series applied on top of upstream QEMU, the -drive option does not 
> work well with the same CXL
> setup (I use run_qemu.sh maintained by Marc et al. at Intel) see below:
> 
> /home/realm/projects/qemu/build/qemu-system-aarch64 -machine 
> virt,accel=tcg,cxl=on,highmem=on,compact-highmem=on,highmem-ecam=on,highmem-mmio=on
>  -m 2048M,slots=0,maxmem=6144M -smp 2,sockets=1,cores=2,threads=1 -display 
> none -nographic -drive 
> if=pflash,format=raw,unit=0,file=AAVMF_CODE.fd,readonly=on -drive 
> if=pflash,format=raw,unit=1,file=AAVMF_VARS.fd -drive 
> file=root.img,format=raw,media=disk -kernel 
> mkosi.extra/boot/vmlinuz-6.15.0-rc4-00040-g128ad8fa385b -initrd 
> mkosi.extra/boot/initramfs-6.15.0-rc4-00040-g128ad8fa385b.img -append 
> selinux=0 audit=0 console=tty0 console=ttyS0 
> root=PARTUUID=14d6bae9-c917-435d-89ea-99af1fa4439a ignore_loglevel rw 
> initcall_debug log_buf_len=20M memory_hotplug.memmap_on_memory=force 
> cxl_acpi.dyndbg=+fplm cxl_pci.dyndbg=+fplm cxl_core.dyndbg=+fplm 
> cxl_mem.dyndbg=+fplm cxl_pmem.dyndbg=+fplm cxl_port.dyndbg=+fplm 
> cxl_region.dyndbg=+fplm cxl_test.dyndbg=+fplm cxl_mock.dyndbg=+fplm 
> cxl_mock_mem.dyndbg=+fplm systemd.set_credential=agetty.autologin:root 
> systemd.set_credential=login.noauth:yes -device 
> e1000,netdev=net0,mac=52:54:00:12:34:56 -netdev 
> user,id=net0,hostfwd=tcp::10022-:22 -cpu max -object 
> memory-backend-file,id=cxl-mem0,share=on,mem-path=cxltest0.raw,size=256M 
> -object 
> memory-backend-file,id=cxl-mem1,share=on,mem-path=cxltest1.raw,size=256M 
> -object 
> memory-backend-file,id=cxl-mem2,share=on,mem-path=cxltest2.raw,size=256M 
> -object 
> memory-backend-file,id=cxl-mem3,share=on,mem-path=cxltest3.raw,size=256M 
> -object memory-backend-file,id=cxl-lsa0,share=on,mem-path=lsa0.raw,size=128K 
> -object memory-backend-file,id=cxl-lsa1,share=on,mem-path=lsa1.raw,size=128K 
> -object memory-backend-file,id=cxl-lsa2,share=on,mem-path=lsa2.raw,size=128K 
> -object memory-backend-file,id=cxl-lsa3,share=on,mem-path=lsa3.raw,size=128K 
> -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=53 -device 
> pxb-cxl,id=cxl.1,bus=pcie.0,bus_nr=191 -device 
> cxl-rp,id=hb0rp0,bus=cxl.0,chassis=0,slot=0,port=0 -device 
> cxl-rp,id=hb0rp1,bus=cxl.0,chassis=0,slot=1,port=1 -device 
> cxl-rp,id=hb1rp0,bus=cxl.1,chassis=0,slot=2,port=0 -device 
> cxl-rp,id=hb1rp1,bus=cxl.1,chassis=0,slot=3,port=1 -device 
> cxl-upstream,port=4,bus=hb0rp0,id=cxl-up0,multifunction=on,addr=0.0,sn=12345678
>  -device cxl-switch-mailbox-cci,bus=hb0rp0,addr=0.1,target=cxl-up0 -device 
> cxl-upstream,port=4,bus=hb1rp0,id=cxl-up1,multifunction=on,addr=0.0,sn=12341234
>  -device cxl-switch-mailbox-cci,bus=hb1rp0,addr=0.1,target=cxl-up1 -device 
> cxl-downstream,port=0,bus=cxl-up0,id=swport0,chassis=0,slot=4 -device 
> cxl-downstream,port=1,bus=cxl-up0,id=swport1,chassis=0,slot=5 -device 
> cxl-downstream,port=2,bus=cxl-up0,id=swport2,chassis=0,slot=6 -device 
> cxl-downstream,port=3,bus=cxl-up0,id=swport3,chassis=0,slot=7 -device 
> cxl-downstream,port=0,bus=cxl-up1,id=swport4,chassis=0,slot=8 -device 
> cxl-downstream,port=1,bus=cxl-up1,id=swport5,chassis=0,slot=9 -device 
> cxl-downstream,port=2,bus=cxl-up1,id=swport6,chassis=0,slot=10 -device 
> cxl-downstream,port=3,bus=cxl-up1,id=swport7,chassis=0,slot=11 -device 
> cxl-type3,bus=swport0,persistent-memdev=cxl-mem0,id=cxl-pmem0,lsa=cxl-lsa0 
> -device 
> cxl-type3,bus=swport2,persistent-memdev=cxl-mem1,id=cxl-pmem1,lsa=cxl-lsa1 
> -device 
> cxl-type3,bus=swport4,volatile-memdev=cxl-mem2,id=cxl-vmem2,lsa=cxl-lsa2 
> -device 
> cxl-type3,bus=swport6,volatile-memdev=cxl-mem3,id=cxl-vmem3,lsa=cxl-lsa3 -M 
> cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k,cxl-fmw.1.targets.0=cxl.0,cxl-fmw.1.targets.1=cxl.1,cxl-fmw.1.size=4G,cxl-fmw.1.interleave-granularity=8k
>  -snapshot -object memory-backend-ram,id=mem0,size=2048M -numa 
> node,nodeid=0,memdev=mem0, -numa cpu,node-id=0,socket-id=0 -numa 
> dist,src=0,dst=0,val=10
> qemu-system-aarch64: -drive file=root.img,format=raw,media=disk: PCI: Only 
> PCI/PCIe bridges can be plugged into pxb-cxl
> 
> Plain upstream QEMU aarch64 target virt machine can handle the -drive option 
> without an issue _without_ those cxl setup options added. I think the error 
> was seen with your previous cxl-2025-03-20 branch. 
> 
> Thanks,
> Itaru.

While the above is not a show stopper for testing CXL, on the aarch64 target 
virt machine I still get errors:

[…]
 22/48 ndctl:cxl / cxl-topology.sh                  FAIL             1.06s   
exit status 1
>>> NDCTL=/root/ndctl/build/ndctl/ndctl 
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>> MALLOC_PERTURB_=66 MESON_TEST_ITERATION=1 
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  
>>> LD_LIBRARY_PATH=/root/ndctl/build/ndctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib
>>>  DATA_PATH=/root/ndctl/test TEST_PATH=/root/ndctl/build/test 
>>> DAXCTL=/root/ndctl/build/daxctl/daxctl 
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  /bin/bash /root/ndctl/test/cxl-topology.sh

23/48 ndctl:cxl / cxl-region-sysfs.sh              FAIL             1.33s   
exit status 1
>>> NDCTL=/root/ndctl/build/ndctl/ndctl 
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>> MESON_TEST_ITERATION=1 
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  
>>> LD_LIBRARY_PATH=/root/ndctl/build/ndctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib
>>>  DATA_PATH=/root/ndctl/test MALLOC_PERTURB_=252 
>>> TEST_PATH=/root/ndctl/build/test DAXCTL=/root/ndctl/build/daxctl/daxctl 
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  /bin/bash /root/ndctl/test/cxl-region-sysfs.sh

24/48 ndctl:cxl / cxl-labels.sh                    OK               2.44s
25/48 ndctl:cxl / cxl-create-region.sh             FAIL             1.09s   
exit status 1
>>> NDCTL=/root/ndctl/build/ndctl/ndctl 
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>> MESON_TEST_ITERATION=1 MALLOC_PERTURB_=216 
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  
>>> LD_LIBRARY_PATH=/root/ndctl/build/ndctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib
>>>  DATA_PATH=/root/ndctl/test TEST_PATH=/root/ndctl/build/test 
>>> DAXCTL=/root/ndctl/build/daxctl/daxctl 
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  /bin/bash /root/ndctl/test/cxl-create-region.sh

26/48 ndctl:cxl / cxl-xor-region.sh                SKIP             0.72s   
exit status 77
27/48 ndctl:cxl / cxl-events.sh                    FAIL             1.11s   
exit status 1
>>> NDCTL=/root/ndctl/build/ndctl/ndctl 
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>> MESON_TEST_ITERATION=1 
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  
>>> LD_LIBRARY_PATH=/root/ndctl/build/ndctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib
>>>  DATA_PATH=/root/ndctl/test MALLOC_PERTURB_=4 
>>> TEST_PATH=/root/ndctl/build/test DAXCTL=/root/ndctl/build/daxctl/daxctl 
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  /bin/bash /root/ndctl/test/cxl-events.sh

28/48 ndctl:cxl / cxl-sanitize.sh                  FAIL             1.19s   
exit status 1
>>> NDCTL=/root/ndctl/build/ndctl/ndctl 
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>> MESON_TEST_ITERATION=1 
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  
>>> LD_LIBRARY_PATH=/root/ndctl/build/ndctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib
>>>  DATA_PATH=/root/ndctl/test MALLOC_PERTURB_=103 
>>> TEST_PATH=/root/ndctl/build/test DAXCTL=/root/ndctl/build/daxctl/daxctl 
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  /bin/bash /root/ndctl/test/cxl-sanitize.sh

29/48 ndctl:cxl / cxl-destroy-region.sh            OK               2.38s
30/48 ndctl:cxl / cxl-qos-class.sh                 FAIL             1.69s   
exit status 1
>>> NDCTL=/root/ndctl/build/ndctl/ndctl 
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>> MESON_TEST_ITERATION=1 
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  
>>> LD_LIBRARY_PATH=/root/ndctl/build/ndctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib
>>>  DATA_PATH=/root/ndctl/test MALLOC_PERTURB_=118 
>>> TEST_PATH=/root/ndctl/build/test DAXCTL=/root/ndctl/build/daxctl/daxctl 
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  /root/ndctl/test/cxl-qos-class.sh

31/48 ndctl:cxl / cxl-poison.sh                    FAIL             0.71s   
exit status 1
>>> NDCTL=/root/ndctl/build/ndctl/ndctl 
>>> ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>> MESON_TEST_ITERATION=1 
>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  
>>> LD_LIBRARY_PATH=/root/ndctl/build/ndctl/lib:/root/ndctl/build/cxl/lib:/root/ndctl/build/daxctl/lib
>>>  DATA_PATH=/root/ndctl/test MALLOC_PERTURB_=174 
>>> TEST_PATH=/root/ndctl/build/test DAXCTL=/root/ndctl/build/daxctl/daxctl 
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>  /bin/bash /root/ndctl/test/cxl-poison.sh
[…]

I’ll check whether these are due to allocation failures in the Host Physical
Address space.

Itaru.

