On 8/10/25 9:11 AM, Tao Tang wrote:

On 2025/8/7 05:28, Pierrick Bouvier wrote:
On 8/6/25 8:11 AM, Tao Tang wrote:
Hi all,

This patch series introduces initial support for emulating the Arm
SMMUv3 Secure State.

As Pierrick pointed out in a previous discussion [1], full Secure SMMUv3
emulation is a notable missing piece in QEMU. While the FVP model has
some support, its limited PCIe capabilities make it challenging for
complex use cases. The ability to properly manage device DMA from a
secure context is a critical prerequisite for enabling device assignment
(passthrough) for confidential computing solutions like Arm CCA and
related research such as virtCCA [2]. This series aims to build that
foundational support in QEMU.


Thanks for posting this series, it's definitely an important piece
missing for emulating newer SMMU versions.

This work is being proposed as an RFC. It introduces a significant
amount of new logic, including the core concept of modeling parallel
secure and non-secure contexts within a single SMMUv3 device. I am
seeking feedback on the overall approach, the core refactoring, and the
implementation details before proceeding further.

The series begins by implementing the components of the secure
programming interface, then progressively refactors the core SMMU logic
to handle secure and non-secure contexts in parallel.

Secure Interface Implementation: The initial patches add the
secure-side registers, implement their read/write logic, and enable
the secure command and event queues. This includes the S_INIT
mechanism and the new secure TLB invalidation commands.

Core Logic Refactoring: The next set of patches makes the core SMMU
functions security-state aware. This involves plumbing an is_secure
context flag through the main code paths and adding logic to route
SMMU-originated memory accesses to the correct (Secure or Non-secure)
address space.

Cache Isolation: With the core logic now aware of security states,
the following patches refactor the configuration and translation
lookup caches. The cache keys are modified to include the security
context, ensuring that secure and non-secure entries for the same
device or address are properly isolated and preventing aliasing.

Framework Integration: The final patch connects the SMMU's internal
security context to the generic QEMU IOMMU framework by using the
iommu_index to represent the architectural SEC_SID.
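
To make the cache-isolation point concrete, here is a minimal sketch of a
lookup key that carries the security state. It is not code from the
series: the names SecSid, TlbKey, tlb_key_make, tlb_key_equal and
tlb_key_hash are hypothetical, and the field set (IOVA, ASID, VMID) is an
assumption chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical security-state tag, named after the architectural SEC_SID. */
typedef enum {
    SEC_SID_NONSECURE = 0,
    SEC_SID_SECURE = 1,
    SEC_SID_NUM
} SecSid;

/* Hypothetical cache key: the security state is part of the identity. */
typedef struct {
    uint64_t iova;
    uint16_t asid;
    uint16_t vmid;
    SecSid sec_sid;
} TlbKey;

static TlbKey tlb_key_make(uint64_t iova, uint16_t asid, uint16_t vmid,
                           SecSid sec_sid)
{
    assert(sec_sid < SEC_SID_NUM);
    TlbKey k = { .iova = iova, .asid = asid, .vmid = vmid,
                 .sec_sid = sec_sid };
    return k;
}

/* Two keys match only if every field, including sec_sid, matches. */
static bool tlb_key_equal(const TlbKey *a, const TlbKey *b)
{
    return a->iova == b->iova && a->asid == b->asid &&
           a->vmid == b->vmid && a->sec_sid == b->sec_sid;
}

/* Simple multiplicative hash; sec_sid is mixed in last. */
static uint32_t tlb_key_hash(const TlbKey *k)
{
    uint32_t h = (uint32_t)(k->iova ^ (k->iova >> 32));
    h = h * 31u + k->asid;
    h = h * 31u + k->vmid;
    h = h * 31u + (uint32_t)k->sec_sid;
    return h;
}
```

With a key of this shape, a secure and a non-secure entry for the same
IOVA, ASID and VMID compare unequal, so a lookup in one security state
cannot return an entry created in the other.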

To validate this work, I performed the following tests:

Non-Secure Regression: To ensure that existing functionality remains
intact, I ran a nested virtualization test. A TCG guest was created on
the host with iommu=smmuv3 and an emulated PCIe NVMe device attached.
The command line for the TCG VM is:

qemu-system-aarch64 \
-machine virt,virtualization=on,gic-version=3,iommu=smmuv3 \
-cpu max -smp 1 -m 4080M \
-accel tcg,thread=single,tb-size=512 \
-kernel Image \
-append 'nokaslr root=/dev/vda rw rootfstype=ext4 iommu.passthrough=on' \
-device pcie-root-port,bus=pcie.0,id=rp0,addr=0x4.0,chassis=1,port=0x10 \
-device pcie-root-port,bus=pcie.0,id=rp1,addr=0x5.0,chassis=2,port=0x11 \
-drive if=none,file=u2204fs.img.qcow2,format=qcow2,id=hd0 \
-device virtio-blk-device,drive=hd0 \
-qmp unix:/tmp/qmp-sock12,server=on,wait=off \
-netdev user,id=eth0,hostfwd=tcp::10022-:22,hostfwd=tcp::59922-:5922 \
-device virtio-net-device,netdev=eth0 \
-drive if=none,file=nvme.img,format=raw,id=nvme0 \
-device nvme,drive=nvme0,serial=deadbeef \
-d unimp,guest_errors -trace events=smmu-events.txt -D qemu.log \
-nographic

Inside this TCG VM, a KVM guest was launched, and the same NVMe device
was re-assigned to it via VFIO.
The command line for the KVM guest inside the TCG VM is:

sudo qemu-system-aarch64  \
-enable-kvm  -m 1024  -cpu host  -M virt \
-machine virt,gic-version=3 \
-cpu max -append "nokaslr" -smp 1 \
-monitor stdio \
-kernel 5.15.Image \
-initrd rootfs.cpio.gz \
-display vnc=:22,id=primary \
-device vfio-pci,host=00:01.0

The KVM guest was able to perform I/O on the device
correctly, confirming that the non-secure path is not broken.

Secure Register/Command Interface: I set up an OP-TEE + Hafnium
environment. Hafnium's smmuv3_driver_init function was used to test
the secure register I/O and command queue functionality (excluding
translation). As Hafnium assumes larger queue and StreamID sizes than
are practical without TTST support, I temporarily patched Hafnium to
use smaller values, allowing its driver to initialize the emulated
secure SMMU successfully.


Would it be possible to share your changes and build instructions
for this? While working on SMMU emulation, we eventually set this aside
for lack of a software stack able to use the secure SMMU, as we were
not aware that Hafnium + OP-TEE could make use of it.

Hi Pierrick,

Thanks for your interest! I'm very happy to share my work on this. I've
documented the setup process, including our code modifications and the
step-by-step build instructions, at this link:

https://hnusdr.github.io/2025/08/09/Test-Secure-SMMU-with-Hafnium-ENG


Thanks for taking the time to assemble all this in a comprehensible post, I'll give it a try when I have some spare time.


The core point of these changes is to enable the SMMUv3 feature in
Hafnium. This leads to numerous read/write operations on SMMUv3 secure
registers and various queue manipulations within the smmuv3_driver_init
function in Hafnium.

However, it's important to note that this initialization process itself
does not initiate any DMA memory access that would trigger the
smmuv3_translate flow.


I understand the difference. It can be tricky to generate specific translation scenarios, which is where a custom test device can really help.

Even so, we've devised a method to test the subsequent Secure
Translation Path by leveraging the smmuv3-test platform device. This
approach allows us to verify the entire SMMUv3 flow, from initialization
to translation.


Does it rely on a custom driver integration into an existing firmware or the kernel?


Secure Translation Path: Since the TF-A SMMUv3 Test Engine does not
support QEMU, and no secure device assignment feature exists yet, I
created a custom platform device to test the secure translation flow.
To trigger the translation logic, I initiated MMIO writes to this
device from within Hafnium. The device's MMIO callback handler then
performed DMA accesses via its IOMMU region, exercising the secure
translation path. While SMMUv3 is typically used for PCIe on
physical SoCs, the architecture allows its use with platform devices
via a stream-id binding in the device tree. The test harness
required some non-standard modifications to decouple the SMMU from
its tight integration with PCIe. The code for this test device is
available for review at [3]. README.md with detailed instructions is
also provided.
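
The MMIO-triggered DMA flow described above can be sketched as a toy
model. This is not the smmuv3-test device from [3] and it uses no QEMU
APIs: the register offsets, ToyTestDev, toy_dev_mmio_write and
toy_translate are all hypothetical, and the translation hook merely
stands in for the device's IOMMU region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout for the toy device. */
enum { REG_DMA_IOVA = 0x0, REG_DOORBELL = 0x8 };

/*
 * Stand-in for the device's IOMMU region: maps an IOVA to a physical
 * address for the given security state, returning false on a fault.
 */
typedef bool (*TranslateFn)(uint64_t iova, bool is_secure, uint64_t *pa);

typedef struct {
    TranslateFn translate;
    bool is_secure;     /* security state of the device's stream */
    uint8_t *ram;       /* toy backing memory */
    size_t ram_size;
    uint64_t dma_iova;  /* latched by a write to REG_DMA_IOVA */
} ToyTestDev;

/*
 * MMIO write handler: a doorbell write performs a one-byte DMA store
 * through the translation hook. Returns false if the access faults.
 */
static bool toy_dev_mmio_write(ToyTestDev *d, uint64_t offset,
                               uint64_t value)
{
    uint64_t pa;

    assert(d && d->translate && d->ram);
    switch (offset) {
    case REG_DMA_IOVA:
        d->dma_iova = value;
        return true;
    case REG_DOORBELL:
        if (!d->translate(d->dma_iova, d->is_secure, &pa) ||
            pa >= d->ram_size) {
            return false;
        }
        d->ram[pa] = (uint8_t)value;
        return true;
    default:
        return false;
    }
}

/*
 * Example hook: secure streams see IOVA 0x1000 and up mapped at PA 0x100
 * and up; non-secure streams have no mapping and always fault.
 */
static bool toy_translate(uint64_t iova, bool is_secure, uint64_t *pa)
{
    if (!is_secure || iova < 0x1000) {
        return false;
    }
    *pa = iova - 0x1000 + 0x100;
    return true;
}
```

In this model, firmware would write an IOVA to REG_DMA_IOVA and then ring
REG_DOORBELL; the store only lands in memory if the translation for the
device's security state succeeds.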


I am not sure about the current policy in QEMU for test oriented
devices, but it would be really useful to have something similar
upstream (Note: it's out of the scope of this series).
One challenge when working on SMMU emulation is that reproducing setups
and triggering specific code paths is hard, because SMMU features are
used only indirectly (through DMA) and the software stack involved is
usually complex.
Having something upstream available to work on SMMU emulation, at
least on device side, would be a great addition.

Eric, Peter, is this something that would be acceptable to merge?


Looking ahead, my plan is to refactor the smmuv3-test platform device.
The goal is to make it self-contained within QEMU, removing the current
dependency on Hafnium to trigger its operations. I plan to submit this
as a separate RFC patch series in the next few days.


This is very welcome. Once this is in place, it would be great to add a new test to make sure things don't regress, and from which we can iterate.
By self-contained within QEMU, do you mean a QTest-based test?

Regards,
Pierrick
