On Wed, 5 Nov 2025 10:40:10 +1000 Gavin Shan <[email protected]> wrote:

> Hi Jonathan and Igor,
>
> On 11/4/25 10:21 PM, Jonathan Cameron wrote:
> > On Mon, 3 Nov 2025 10:52:16 +0100
> > Igor Mammedov <[email protected]> wrote:
> >
>
> [...]
>
> >> My idea using per cpu source is just a speculation based on spec
> >> on how workaround the problem,
> >> I don't really know if guest OS will be able to handle it (aka,
> >> need to be tested is it's viable). That also probably was a reason
> >> in previous review, why should've waited for multiple sources
> >> support be be merged first before this series.
> >
> > Per vCPU should work fine but I do like the approach here of reporting
> > all the related errors in one go as they represent the underlying nature
> > of the error granularity tracking. If anyone ever poisons at the 1GiB level
> > on the host they are on their own - so I think that it will only ever be
> > the finest granularity supported (so worse case 64KiB).
> >
>
> Well, I don't have strong opinions, but I intend to agree with Jonathan
> to report all 16x errors at once. One reason is one as Jonathan mentioned.
> Another reason is per vCPU error source is a bit heavy for the improvement.
>
> So I'm going to improve (v2) series to address all received comments and
> post a (v3) series.
>
> I already had the prototype of error source per vcpu, which works fine for
> 64KB-host-4KB-guest. However, it doesn't work for huge pages. For example,
> a problematic 512MB huge page can cause heavy memory error storm to QEMU
> where we absolutely can't handle.
>
> 1. Start the VM with hugetlb pages
>
>    /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>    -accel kvm -machine virt,gic-version=host,nvdimm=on,ras=on \
>    -cpu host -smp maxcpus=8,cpus=8,sockets=2,clusters=2,cores=2,threads=1 \
>    -m 4096M,slots=16,maxmem=128G \
>    -object memory-backend-file,id=mem0,prealloc=on,mem-path=/dev/hugepages-524288kB,size=4096M \
>    -numa node,nodeid=0,cpus=0-7,memdev=mem0 \
>
> 2. Run 'victim -d' on guest
>
>    guest$ ./victim -d
>    physical address of (0xffff889d6000) = 0x11a7da000
>    Hit any key to trigger error:
>
> 3. Inject error from host
>
>    host$ errinjct 0x11a7da000
>
> 4. QEMU crashes with error message "Bus error (core dumped)", which is triggered
>    the following path.
>
>    sigbus_handler
>    kvm_on_sigbus_vcpu      // have_sigbus_pending = 1
>    sigbus_reraise

To me this sounds like something that should not be happening on the host
unless a real memory error is detected that blows away the whole of / most of
a huge page. I'm not sure we care about surviving that case if it isn't
mapped using hugetlb/DAX or similar in the guest (so contiguous in both with
contained impact in both).

I assume the issue is backing with hugetlbfs which doesn't have a sub huge
page granularity for poison tracking. I vaguely recall an effort to solve that

https://lore.kernel.org/linux-mm/[email protected]/
was the first thing google threw me. Looks like it got to v2.
https://lore.kernel.org/linux-mm/[email protected]/

+CC James.

>
> Thanks,
> Gavin
>
>
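
For a sense of scale, here is a rough back-of-the-envelope sketch of why the
64KB-host-4KB-guest case yields the "16x errors" mentioned above while a 512MB
hugetlb page turns into a storm. It assumes poison is tracked at host page
granularity and that each affected guest page would need its own error record.
The helper name records_per_poisoned_page is hypothetical and does not
correspond to anything in QEMU or the kernel.

```python
KIB = 1024
MIB = 1024 * KIB


def records_per_poisoned_page(host_page_size: int, guest_page_size: int) -> int:
    """Count the guest-page-sized regions covered by one poisoned host page.

    Illustrative assumption: the host tracks poison per host page, and the
    guest would see one error record per guest page inside that host page.
    """
    if host_page_size <= 0 or guest_page_size <= 0:
        raise ValueError("page sizes must be positive")
    # If the host page is smaller than a guest page, the guest still loses
    # one whole guest page, so never report fewer than one record.
    return max(1, host_page_size // guest_page_size)


for label, host_size in (
    ("64KiB host base page", 64 * KIB),
    ("512MiB hugetlb page", 512 * MIB),
):
    count = records_per_poisoned_page(host_size, 4 * KIB)
    print(f"{label}: {count} guest 4KiB pages affected")
```

With a 4KiB guest page size this gives 16 for the 64KiB case and 131072 for
the 512MiB hugetlb case, which is the gap between "report them all in one go"
and something that cannot reasonably be reported per page.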
