On Thu, Jun 25, 2020 at 08:34:14PM +0200, Paolo Bonzini wrote:
> On 25/06/20 17:57, Roman Bolshakov wrote:
> > So, the kick is not delivered to self and in case if destination cpu is
> > not running. I think it can't interrupt subsequent hv_vcpu_run.
>
> Yes.
>
> >> If not, you can reduce a bit the race window by setting a variable in
> >> cpu, like
> >>
> >>     atomic_set(&cpu->deadline, 0);
> >>     hv_vcpu_interrupt(...)
> >>
> >> and in the vCPU thread
> >>
> >>     hv_vcpu_run_until(..., atomic_read(&cpu->deadline));
> >>     atomic_set(&cpu->deadline, HV_DEADLINE_FOREVER);
> >>
> >
> > Sure, could you please explain who'll be racing? There's a race if a
> > kick was sent after VMEXIT, right? So essentially we need a way to
> > "requeue" a kick that was received outside of hv_vcpu_run to avoid loss
> > of it?
>
> Yes. Note that this is not a new bug, it's pre-existing and it's common
> to all hypervisors except KVM/WHPX. I mean not the QEMU code, it's the
> kernel APIs that are broken. :)
>
> One way to do so is to keep the signal, and have the signal handler
> enable the preemption timer (with a deadline of 0) in the pin-based
> interrupt controls. Hopefully macOS allows that, especially on 10.15+
> where hv_vcpu_run_until probably uses the preemption timer.
>
> > hv_vcpu_run_until is only available on macOS 10.15+ and we can't use yet
> > because of three release support rule.
> > (https://developer.apple.com/documentation/hypervisor/3181548-hv_vcpu_run_until?language=objc)
> >
> > BTW, I'm totally okay to send v2 if kicks are lost and/or the patch
> > needs improvements. (and I can address EFER to VMCS Entry Controls
> > synchronization as well)
> >
> > Paolo, do you know any particular test in kvm-unit-tests that can
> > exhibit the issue?
>
> No, it's a race and it's extremely rare, but I point it out because it's
> a kernel issue that Apple might want to fix anyway. It might also be
> (depending on how the kernel side is written) that the next scheduler
> tick will end up unblocking the vCPU and papering over it.
>

Hi Paolo,

I implemented what you proposed using the VMX-preemption timer in the
pin-based controls and regular hv_vcpu_run(). It works fine without
noticeable regressions; I'll send that in v2.

hv_vcpu_run_until() was also evaluated on macOS 10.15.5, but it degrades
VM performance significantly compared to explicitly setting the
VMX-preemption timer value and calling hv_vcpu_run(). The performance
issue was observed on a Broadwell-based MacBook Air and an Ivy
Bridge-based MacBook Pro.

macOS 11.0 Beta deprecated hv_vcpu_run() and introduced a special
declaration for hv_vcpu_run_until() that's not available in 10.15:
HV_DEADLINE_FOREVER (UINT64_MAX, which is bigger than the maximum value
of the VMX-preemption timer). Perhaps the performance issue is addressed
there.

Regards,
Roman
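
For illustration only, here is a minimal sketch of the kick path discussed
above: the kick arms the VMX-preemption timer with a value of 0 in the
pin-based controls, so the next VM entry exits immediately, and the exit
path disarms it again. This is not the v2 patch. The struct vmcs_model
type and the kick_* helpers are hypothetical stand-ins for the real VMCS
accessors, and whether those accessors may be called from a signal
handler on macOS is an assumption that would need checking. The only
fact taken from the Intel SDM is that bit 6 of the pin-based
VM-execution controls activates the VMX-preemption timer.

```c
#include <stdint.h>

/* Bit 6 of the pin-based VM-execution controls:
 * "activate VMX-preemption timer" (Intel SDM). */
#define PIN_CTLS_PREEMPTION_TIMER (1u << 6)

/* Hypothetical in-memory model of the two VMCS fields involved.
 * Real code would go through the hypervisor's VMCS read/write calls. */
struct vmcs_model {
    volatile uint32_t pin_ctls;               /* pin-based controls */
    volatile uint32_t preemption_timer_value; /* 32-bit timer value */
};

/* Kick path (assumed to run on the vCPU thread, e.g. from the signal
 * handler): request an immediate exit on the next VM entry. */
static void kick_arm_timer(struct vmcs_model *vmcs)
{
    vmcs->preemption_timer_value = 0;
    vmcs->pin_ctls |= PIN_CTLS_PREEMPTION_TIMER;
}

/* Exit path: once the kick has been consumed, turn the timer off so
 * later entries run normally. */
static void kick_disarm_timer(struct vmcs_model *vmcs)
{
    vmcs->pin_ctls &= ~PIN_CTLS_PREEMPTION_TIMER;
}

/* Returns nonzero if a kick is currently armed. */
static int kick_is_armed(const struct vmcs_model *vmcs)
{
    return (vmcs->pin_ctls & PIN_CTLS_PREEMPTION_TIMER) != 0;
}
```

The point of the model is that a kick arriving outside hv_vcpu_run() is
not lost: it stays recorded in the controls until the next entry
consumes it.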