On 21/09/20 04:22, zhenwei pi wrote:
> Hi,
>
> A patchset about handling 'MCE' might have been ignored, can anyone tell
> me whether the purpose is reasonable?
>
> https://patchwork.kernel.org/cover/11773795/

Yes, it's very useful.  Just one thing, "guest-mce" can be reported for
both AR and AO faults.  Is it worth adding a 'type' field to distinguish
the two?

Paolo

> On 9/14/20 9:43 PM, zhenwei pi wrote:
>> Although QEMU could catch signal BUS to handle hardware memory
>> corrupted event, sadly, QEMU just prints a little log and try to fix
>> it silently.
>>
>> In these patches, introduce a 'MEMORY_FAILURE' event with 4 detailed
>> actions of QEMU, then uplayer could know what situaction QEMU hit and
>> did. And further step we can do: if a host server hits a
>> 'hypervisor-ignore' or 'guest-mce', scheduler could migrate VM to
>> another host; if hitting 'hypervisor-stop' or 'guest-triple-fault',
>> scheduler could select other healthy servers to launch VM.
>>
>> zhenwei pi (3):
>>   target-i386: seperate MCIP & MCE_MASK error reason
>>   iqapi/run-state.json: introduce memory failure event
>>   target-i386: post memory failure event to uplayer
>>
>>  qapi/run-state.json  | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>>  target/i386/helper.c | 30 +++++++++++++++++++++++-------
>>  target/i386/kvm.c    |  5 ++++-
>>  3 files changed, 73 insertions(+), 8 deletions(-)
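
For illustration only, the upper-layer policy described in the cover
letter could be sketched roughly as below. The four action names and the
event name come from the cover letter. The payload layout (an "action"
key inside the event's "data" member) and the names POLICY, MIGRATE,
RELAUNCH and decide() are assumptions made for this sketch, not
something the patches define.

```python
import json

# Hypothetical reaction labels for a management-layer scheduler.
MIGRATE = "migrate-vm"
RELAUNCH = "relaunch-vm-elsewhere"

# Action names as listed in the cover letter, mapped to the reaction
# the cover letter proposes for each of them.
POLICY = {
    "hypervisor-ignore": MIGRATE,
    "guest-mce": MIGRATE,
    "hypervisor-stop": RELAUNCH,
    "guest-triple-fault": RELAUNCH,
}


def decide(qmp_line):
    """Return the reaction for one QMP JSON line, or None if not applicable."""
    msg = json.loads(qmp_line)
    if msg.get("event") != "MEMORY_FAILURE":
        return None
    # Assumption: the action is carried as data.action in the event.
    action = msg.get("data", {}).get("action")
    return POLICY.get(action)


if __name__ == "__main__":
    sample = '{"event": "MEMORY_FAILURE", "data": {"action": "guest-mce"}}'
    print(decide(sample))
```

If a 'type' field distinguishing AR from AO were added, the lookup key
could become the (action, type) pair instead of the action alone.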