On Wed, Feb 02, 2022 at 02:09:54PM +0000, Jonathan Cameron wrote: > Changes since v4: > https://lore.kernel.org/linux-cxl/20220124171705.10432-1-jonathan.came...@huawei.com/ > > Note documentation patch that Alex requested to follow. > I don't want to delay getting this out as Alex mentioned possibly > having time to continue reviewing in latter part of this week. > > Issues identified by CI / Alex Bennée > - Stubs added for hw/cxl/cxl-host and hw/acpi/cxl plus related meson > changes to use them as necessary. > - Drop uid from cxl-test (result of last minute change in v4 that was not > carried through to the test) > - Fix naming clash with field name ERROR which on some arches is defined > and results in the string being replaced with 0 in some of the > register field related defines. Call it ERR instead. > - Fix type issue around mr->size by using 64 bit acessor functions. > - Add a new patch to exclude pxb-cxl from device-crash-test in similar > fashion to pxb. > > CI tests now passing with exception of checkpatch which has what > I think is a false positive and build-oss-fuzz which keeps timing out. > https://gitlab.com/jic23/qemu/-/pipelines/460109208 > There were a few tweaks to patch descriptions after I pushed that > out (I missed a few RB from Alex).
There's an RFC patch that needs review from core memory maintainers, so I guess not all of it is for merge just yet? Is there any way we can start applying this patchset gradually? > Other changes (mostly from Alex's review) > - Change component register handling to now report UNIMP and return 0 > for 8 byte registers as we currently don't implement any of them. > Note that this means we need a kernel fix: > > https://lore.kernel.org/linux-cxl/20220201153437.2873-1-jonathan.came...@huawei.com/ > - Drop majority of the macros used in defining mailbox handlers in > favour of written out code. > - Use REG64 where appropriate. This was introduced whilst this set > has been underdevelopment so I missed it. > - Clarify some register access options wrt to CXL 2.0 Errata F4. > - Change timestamp to qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) > - Use typed enums to enforce types of function arguements. > - Default to cxl being off in machine_class_init() removing > need to set it to off in machines where there is no support as yet. > - Add Alex's RB where given. > > Looking in particular for: > * Review of the PCI interactions > * x86 and ARM machine interactions (particularly the memory maps) > * Review of the interleaving approach - is the basic idea > acceptable? > * Review of the command line interface. > * CXL related review welcome but much of that got reviewed > in earlier versions and hasn't changed substantially. > > Big TODOs: > > * Interleave boundary issues. I haven't yet solved this but didn't > want to futher delay the review of the rest of the series. > > * Volatile memory devices (easy but it's more code so left for now). > * Switch support. Linux kernel support is under review currently, > so there is now something to test against. > * Hotplug? May not need much but it's not tested yet! > * More tests and tighter verification that values written to hardware > are actually valid - stuff that real hardware would check. > * Testing, testing and more testing. I have been running a basic > set of ARM and x86 tests on this, but there is always room for > more tests and greater automation. > * CFMWS flags as requested by Ben. > > Why do we want QEMU emulation of CXL? > > As Ben stated in V3, QEMU support has been critical to getting OS > software written given lack of availability of hardware supporting the > latest CXL features (coupled with very high demand for support being > ready in a timely fashion). What has become clear since Ben's v3 > is that situation is a continuous one. Whilst we can't talk about > them yet, CXL 3.0 features and OS support have been prototyped on > top of this support and a lot of the ongoing kernel work is being > tested against these patches. The kernel CXL mocking code allows > some forms of testing, but QEMU provides a more versatile and > exensible platform. > > Other features on the qemu-list that build on these include PCI-DOE > /CDAT support from the Avery Design team further showing how this > code is useful. Whilst not directly related this is also the test > platform for work on PCI IDE/CMA + related DMTF SPDM as CXL both > utilizes and extends those technologies and is likely to be an early > adopter. > Refs: > CMA Kernel: > https://lore.kernel.org/all/20210804161839.3492053-1-jonathan.came...@huawei.com/ > CMA Qemu: > https://lore.kernel.org/qemu-devel/1624665723-5169-1-git-send-email-cbr...@avery-design.com/ > DOE Qemu: > https://lore.kernel.org/qemu-devel/1623329999-15662-1-git-send-email-cbr...@avery-design.com/ > > As can be seen there is non trivial interaction with other areas of > Qemu, particularly PCI and keeping this set up to date is proving > a burden we'd rather do without :) > > Ben mentioned a few other good reasons in v3: > https://lore.kernel.org/qemu-devel/20210202005948.241655-1-ben.widaw...@intel.com/ > > The evolution of this series perhaps leave it in a less than > entirely obvious order and that may get tidied up in future postings. > I'm also open to this being considered in bite sized chunks. What > we have here is about what you need for it to be useful for testing > currently kernel code. Note the kernel code is moving fast so > since v4, some features have been introduced we don't yet support in > QEMU (e.g. use of the PCIe serial number extended capability). > > All comments welcome. > > qemu-system-aarch64 -M virt,gic-version=3,cxl=on \ > -m 4g,maxmem=8G,slots=8 \ > ... > -object > memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest.raw,size=256M \ > -object > memory-backend-file,id=cxl-mem2,share=on,mem-path=/tmp/cxltest2.raw,size=256M > \ > -object > memory-backend-file,id=cxl-mem3,share=on,mem-path=/tmp/cxltest3.raw,size=256M > \ > -object > memory-backend-file,id=cxl-mem4,share=on,mem-path=/tmp/cxltest4.raw,size=256M > \ > -object > memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa.raw,size=256M \ > -object > memory-backend-file,id=cxl-lsa2,share=on,mem-path=/tmp/lsa2.raw,size=256M \ > -object > memory-backend-file,id=cxl-lsa3,share=on,mem-path=/tmp/lsa3.raw,size=256M \ > -object > memory-backend-file,id=cxl-lsa4,share=on,mem-path=/tmp/lsa4.raw,size=256M \ > -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ > -device pxb-cxl,bus_nr=222,bus=pcie.0,id=cxl.2 \ > -device cxl-rp,port=0,bus=cxl.1,id=root_port13,chassis=0,slot=2 \ > -device > cxl-type3,bus=root_port13,memdev=cxl-mem1,lsa=cxl-lsa1,id=cxl-pmem0,size=256M > \ > -device cxl-rp,port=1,bus=cxl.1,id=root_port14,chassis=0,slot=3 \ > -device > cxl-type3,bus=root_port14,memdev=cxl-mem2,lsa=cxl-lsa2,id=cxl-pmem1,size=256M > \ > -device cxl-rp,port=0,bus=cxl.2,id=root_port15,chassis=0,slot=5 \ > -device > cxl-type3,bus=root_port15,memdev=cxl-mem3,lsa=cxl-lsa3,id=cxl-pmem2,size=256M > \ > -device cxl-rp,port=1,bus=cxl.2,id=root_port16,chassis=0,slot=6 \ > -device > cxl-type3,bus=root_port16,memdev=cxl-mem4,lsa=cxl-lsa4,id=cxl-pmem3,size=256M > \ > -cxl-fixed-memory-window targets=cxl.1,size=4G,interleave-granularity=8k \ > -cxl-fixed-memory-window > targets=cxl.1,targets=cxl.2,size=4G,interleave-granularity=8k > > First CFMWS suitable for up to 2 way interleave, the second for 4 way (2 way > at host level and 2 way at the host bridge). > targets=<range of pxb-cxl uids> , multiple entries if range is disjoint. > > With the v5.17-rc1 + patch series listed below. > > cd /sys/bus/cxl/devices/ > region=$(cat decoder0.1/create_region) > echo $region > decoder0.1/create_region > ls -lh > > //Note the order of devices and adjust the following to make sure they > //are in order across the 4 root ports. Easy to do in a tool, but > //not easy to paste in a cover letter. > > cd region0.1\:0 > echo 4 > interleave_ways > echo mem2 > target0 > echo mem3 > target1 > echo mem0 > target2 > echo mem1 > target3 > echo $((1024<<20)) > size > echo 4096 > interleave_granularity > echo region0.1:0 > /sys/bus/cxl/drivers/cxl_region/bind > > Tested with devmem2 and files with known content. > Kernel tree is mainline + (I based on 5.17-rc1) > [PATCH] cxl/regs: Fix size of CXL Capabilty Header Register > https://lore.kernel.org/linux-cxl/20220201182934.jjvavjsf4h7oq...@intel.com/T/#t > > [PATCH v3 00/40] CXL.mem Topology Discovery and Hotplug Support > https://lore.kernel.org/linux-cxl/164298411792.3018233.7493009997525360044.st...@dwillia2-desk3.amr.corp.intel.com/ > Note that series has a lot of v4/v5 patches are replies but b4 does > a good job of pulling out the latest. > > [PATCH 0/2] cxl/port: Robustness fixes for decoder enumeration > https://lore.kernel.org/linux-cxl/164317463887.3438644.4087819721493502301.st...@dwillia2-desk3.amr.corp.intel.com/ > > [PATCH 0/4] Unify meaning of interleave attributes > https://lore.kernel.org/linux-cxl/20220127212911.127741-1-ben.widaw...@intel.com/ > > [PATCH v3 00/14] CXL Region driver > https://lore.kernel.org/linux-cxl/20220128002707.391076-1-ben.widaw...@intel.com/ > > What follows is a first attempt at explaining how all these components > fit together. I'll write up some formal documentation shortly. > > Memory Address Map for CXL elements. Note where exactly these regions > appear is Arch and platform dependent. > > Base somewhere far up in the Host PA map. > _______________________________ > | | > | CXL Host Bridge 0 Registers | > | CXL Host Bridge 1 Registers | > | ... | This bit is normal MMIO register space. > | CXL Host bridge N registers | including programmable interleave decoders > |______________________________| for interleave across root ports. > | | > .... > | | > |______________________________| > | | > | CFMW 0, | Note that there can be multiple regions > | Interleave 2 way, targets | of memory within this 1TB which can be > | Hostbridge 0, Hostbridge 1 | interleaved differently: in the host bridges > | Granularity 16KiB, 1TB | across root ports or in switches below the > root. > |______________________________| ports > | | > | CFMW 1, | > | Interleave 1 way, target | > | Hostbridge 0, 512GiB | > |______________________________| > etc for all interleave combinations > configured, or built in to the > system before any generic software > sees it. > > System Topology considering CFMW 0 only to keep this simple. > x marks the match in each decoder level. > Switches have more interleave decoders and other features > that we haven't implemented yet in QEMU. > > Address Read to CFMW0 base + N > _________________|________________ > | | > | Host interconnect | > | Configured to route CFM | > | memory access to particular HB | > |_____x____________________________| > | | > Interleave Decoder | > Matches this HB | > | | > _______|__________ _____|____________ > | | | | > | CXL HB 0 | | CXL HB 1 | Only exist in PCI > (mostly) > | HB IntLv Decoder | | HB IntLv Decoder | via ACPI description > | PCI Root Bus 0c | | PCI Root Bus 0d | > |x_________________| |__________________| In CXL have MMIO > | | | | at location given in > CEDT > | | | | CHBS entry (ACPI) > ____________|___ __________|__ __|_________ ___|_________ > | Root Port 0 | | Root Port 1 | | Root Port 2| | Root Port 3 | > | Appears in | | Appears in | | Appears in | | Appear in | > | PCI topology | | PCI Topology| | PCI Topo | | PCI Topo | > | As 0c:00.0 | | as 0c:01.0 | | as de:00.0 | | as de:01.0 | > |_______________| |_____________| |____________| |_____________| > | | | | > | | | | > _____|_________ ______|______ ______|_____ ______|_______ > | x | | | | | | | > | CXL Type3 0 | | CXL Type3 1 | | CXL type3 2| | CLX Type 3 3 | > | | | | | | | | > | PMEM0(Vol LSA)| | PMEM1 (...) | | PMEM2 (...)| | PMEM3 (...) | > | Decoder to go | | | | | | | > | from host PA | | PCI 0e:00.0 | | PCI df:00.0| | PCI e0:00.0 | > | to device PA | | | | | | | > | PCI as 0d:00.0| | | | | | | > |_______________| |_____________| |____________| |______________| > > Backed by Backed by Backed by Backed by > file 0 file 1 file 2 file 3 > > LSA backed by additional files for each device (not yet supported) > > So currently we have decoders as follows for each interleaved access. > 1) CFMW decoder - fixed config so forms part of qemu command line. > 2) Host bridge decoders - programmable decoders that the system > software will program either based on user command or based > on info from the Label Storage Area (not yet emulated) > 3) Type 3 device decoders. Down to here the address used is the > Host PA. These decoders convert to the local device PA > (in simple case - drop some bits in the middle of the address) > > Future patches will add decoders in switch upstream ports making > the above diagram have another layer between root ports and > the memory devices. > > Note, we've focused for now on Persistent Memory devices as they are seen > as an early and important usecase (and are the most complex one). > But it should be straight forward to add volatile memory > support and indeed that would be backed by RAM. > > lspci -tv for above shows > > -+-[0000:00]-+-00.0 Red Hat, Inc. QEMU PCIe Host Bridge (this is the cxl PXB)f > | \-OTHER STUFF > +-[0000:0c]-+-00.0-[0d]----00.0 Intel Corporation Device 0d93 > | \-01.0-[0e]----00.0 Intel Corporation Device 0d93 > \-[0000:de]-+-00.0-[df]----00.0 Intel Corporation Device 0d93 > \-01.0-[e0]----00.0 Intel Corporation Device 0d93 > > Where those Intel parts are the type 3 devices. > > All comments welcome! > > Particular thanks to Alex Bennée for his review of v4. > > Thanks, > > Jonathan > > Ben Widawsky (26): > hw/pci/cxl: Add a CXL component type (interface) > hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5) > hw/cxl/device: Introduce a CXL device (8.2.8) > hw/cxl/device: Implement the CAP array (8.2.8.1-2) > hw/cxl/device: Implement basic mailbox (8.2.8.4) > hw/cxl/device: Add memory device utilities > hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1) > hw/cxl/device: Timestamp implementation (8.2.9.3) > hw/cxl/device: Add log commands (8.2.9.4) + CEL > hw/pxb: Use a type for realizing expanders > hw/pci/cxl: Create a CXL bus type > hw/pxb: Allow creation of a CXL PXB (host bridge) > acpi/pci: Consolidate host bridge setup > hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142) > hw/cxl/rp: Add a root port > hw/cxl/device: Add a memory device (8.2.8.5) > hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12) > acpi/cxl: Add _OSC implementation (9.14.2) > tests/acpi: allow CEDT table addition > acpi/cxl: Create the CEDT (9.14.1) > hw/cxl/device: Add some trivial commands > hw/cxl/device: Plumb real Label Storage Area (LSA) sizing > hw/cxl/device: Implement get/set Label Storage Area (LSA) > acpi/cxl: Introduce CFMWS structures in CEDT > hw/cxl/component Add a dumb HDM decoder handler > qtest/cxl: Add very basic sanity tests > > Jonathan Cameron (17): > MAINTAINERS: Add entry for Compute Express Link Emulation > tests/acpi: allow DSDT.viot table changes. > tests/acpi: Add update DSDT.viot > cxl: Machine level control on whether CXL support is enabled > hw/cxl/component: Add utils for interleave parameter encoding/decoding > hw/cxl/host: Add support for CXL Fixed Memory Windows. > hw/pci-host/gpex-acpi: Add support for dsdt construction for pxb-cxl > pci/pcie_port: Add pci_find_port_by_pn() > CXL/cxl_component: Add cxl_get_hb_cstate() > mem/cxl_type3: Add read and write functions for associated hostmem. > cxl/cxl-host: Add memops for CFMWS region. > arm/virt: Allow virt/CEDT creation > hw/arm/virt: Basic CXL enablement on pci_expander_bridge instances > pxb-cxl > RFC: softmmu/memory: Add ops to memory_region_ram_init_from_file > i386/pc: Enable CXL fixed memory windows > qtest/acpi: Add reference CEDT tables. > scripts/device-crash-test: Add exception for pxb-cxl > > MAINTAINERS | 7 + > hw/Kconfig | 1 + > hw/acpi/Kconfig | 5 + > hw/acpi/cxl-stub.c | 12 + > hw/acpi/cxl.c | 231 +++++++++++++ > hw/acpi/meson.build | 4 +- > hw/arm/Kconfig | 1 + > hw/arm/virt-acpi-build.c | 30 ++ > hw/arm/virt.c | 40 ++- > hw/core/machine.c | 28 ++ > hw/cxl/Kconfig | 3 + > hw/cxl/cxl-component-utils.c | 284 ++++++++++++++++ > hw/cxl/cxl-device-utils.c | 271 ++++++++++++++++ > hw/cxl/cxl-host-stubs.c | 22 ++ > hw/cxl/cxl-host.c | 263 +++++++++++++++ > hw/cxl/cxl-mailbox-utils.c | 483 ++++++++++++++++++++++++++++ > hw/cxl/meson.build | 12 + > hw/i386/acpi-build.c | 98 ++++-- > hw/i386/pc.c | 57 +++- > hw/mem/Kconfig | 5 + > hw/mem/cxl_type3.c | 353 ++++++++++++++++++++ > hw/mem/meson.build | 1 + > hw/meson.build | 1 + > hw/pci-bridge/Kconfig | 5 + > hw/pci-bridge/cxl_root_port.c | 231 +++++++++++++ > hw/pci-bridge/meson.build | 1 + > hw/pci-bridge/pci_expander_bridge.c | 171 +++++++++- > hw/pci-bridge/pcie_root_port.c | 6 +- > hw/pci-host/gpex-acpi.c | 22 +- > hw/pci/pci.c | 21 +- > hw/pci/pcie_port.c | 25 ++ > include/hw/acpi/cxl.h | 28 ++ > include/hw/arm/virt.h | 1 + > include/hw/boards.h | 2 + > include/hw/cxl/cxl.h | 51 +++ > include/hw/cxl/cxl_component.h | 206 ++++++++++++ > include/hw/cxl/cxl_device.h | 272 ++++++++++++++++ > include/hw/cxl/cxl_pci.h | 160 +++++++++ > include/hw/pci/pci.h | 14 + > include/hw/pci/pci_bridge.h | 20 ++ > include/hw/pci/pci_bus.h | 7 + > include/hw/pci/pci_ids.h | 1 + > include/hw/pci/pcie_port.h | 2 + > qapi/machine.json | 15 + > qemu-options.hx | 37 +++ > scripts/device-crash-test | 1 + > softmmu/memory.c | 9 + > softmmu/vl.c | 11 + > tests/data/acpi/pc/CEDT | Bin 0 -> 36 bytes > tests/data/acpi/q35/CEDT | Bin 0 -> 36 bytes > tests/data/acpi/q35/DSDT.viot | Bin 9398 -> 9416 bytes > tests/data/acpi/virt/CEDT | Bin 0 -> 36 bytes > tests/qtest/cxl-test.c | 151 +++++++++ > tests/qtest/meson.build | 4 + > 54 files changed, 3645 insertions(+), 41 deletions(-) > create mode 100644 hw/acpi/cxl-stub.c > create mode 100644 hw/acpi/cxl.c > create mode 100644 hw/cxl/Kconfig > create mode 100644 hw/cxl/cxl-component-utils.c > create mode 100644 hw/cxl/cxl-device-utils.c > create mode 100644 hw/cxl/cxl-host-stubs.c > create mode 100644 hw/cxl/cxl-host.c > create mode 100644 hw/cxl/cxl-mailbox-utils.c > create mode 100644 hw/cxl/meson.build > create mode 100644 hw/mem/cxl_type3.c > create mode 100644 hw/pci-bridge/cxl_root_port.c > create mode 100644 include/hw/acpi/cxl.h > create mode 100644 include/hw/cxl/cxl.h > create mode 100644 include/hw/cxl/cxl_component.h > create mode 100644 include/hw/cxl/cxl_device.h > create mode 100644 include/hw/cxl/cxl_pci.h > create mode 100644 tests/data/acpi/pc/CEDT > create mode 100644 tests/data/acpi/q35/CEDT > create mode 100644 tests/data/acpi/virt/CEDT > create mode 100644 tests/qtest/cxl-test.c > > -- > 2.32.0