On Fri, Aug 27, 2021, Mathieu Desnoyers wrote:
> > So there are effectively three reasons we want a delay:
> > 
> >  1. To allow sched_setaffinity() to coincide with ioctl(KVM_RUN) before KVM can
> >     enter the guest so that the guest doesn't need an arch-specific VM-Exit source.
> > 
> >  2. To let ioctl(KVM_RUN) make its way back to the test before the next round
> >     of migration.
> > 
> >  3. To ensure the read-side can make forward progress, e.g. if sched_getcpu()
> >     involves a syscall.
> > 
> > 
> > After looking at KVM for arm64 and s390, #1 is a bit tenuous because x86 is the
> > only arch that currently uses xfer_to_guest_mode_work(), i.e. the test could be
> > tweaked to be overtly x86-specific.  But since a delay is needed for #2 and #3,
> > I'd prefer to rely on it for #1 as well in the hopes that this test provides
> > coverage for arm64 and/or s390 if they're ever converted to use the common
> > xfer_to_guest_mode_work().
> 
> Now that we have this understanding of why we need the delay, it would be
> good to write this down in a comment within the test.

Ya, I'll get a new version out next week.
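Something along these lines, as a rough sketch only (exact wording and placement
to be sorted out in the new version):

	/*
	 * Add a delay after each migration so that:
	 *
	 *  1. sched_setaffinity() can coincide with ioctl(KVM_RUN), i.e. the
	 *     migration can hit while the vCPU task is entering the guest,
	 *     without requiring an arch-specific VM-Exit source.
	 *
	 *  2. ioctl(KVM_RUN) can make its way back to the test before the
	 *     next round of migration.
	 *
	 *  3. The read-side can make forward progress, e.g. if sched_getcpu()
	 *     involves a syscall.
	 */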

> Does it reproduce if we randomize the delay to have it picked randomly from
> 0us to 100us (with 1us step)? It would remove much of the need for an
> arch-specific magic delay value.

My less-than-scientific testing shows that it can reproduce at delays up to
~500us, but above ~10us the reproducibility starts to drop.  The bug still
reproduces reliably, it just takes more iterations, and obviously the test
runs a bit slower.

Any objection to using a 1-10us delay, e.g. a simple usleep((i % 10) + 1)?
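Roughly the following, as an illustrative sketch only; the loop structure and
names below are placeholders to show where the delay would go, not the actual
test code:

	#define _GNU_SOURCE
	#include <sched.h>
	#include <unistd.h>

	/* Placeholder migration loop, names are made up for illustration. */
	static void migration_loop(pid_t vcpu_tid, int nr_cpus, int nr_migrations)
	{
		cpu_set_t cpuset;
		int i;

		for (i = 0; i < nr_migrations; i++) {
			/* Bounce the vCPU task across CPUs. */
			CPU_ZERO(&cpuset);
			CPU_SET(i % nr_cpus, &cpuset);
			sched_setaffinity(vcpu_tid, sizeof(cpuset), &cpuset);

			/* 1-10us delay, cycling through the range as 'i' advances. */
			usleep((i % 10) + 1);
		}
	}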
