On Tue, Oct 22, 2019 at 04:25:19PM -0700, [email protected] wrote:
> On Tue, 22 Oct 2019, Andreas Rottmann wrote:
> > >Synopsis:  panic: pvclock0: unstable result on stable clock
> > >Category:  virtualization
> > >Environment:
> >     System      : OpenBSD 6.6
> >     Details     : OpenBSD 6.6 (GENERIC.MP) #372: Sat Oct 12 10:56:27 MDT 2019
> >                      [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > 
> >     Architecture: OpenBSD.amd64
> >     Machine     : amd64
> > >Description:
> > 
> > I've just experienced a kernel panic when resuming my laptop from 
> > suspend-to-RAM while my OpenBSD 6.6 VM was running; the first few lines 
> > of the crash read like this:
> > 
> > panic: pvclock0: unstable result on stable clock
> > Stopped at      db_enter+0x10:  popq    %rbp
> >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > db_enter() at db_enter+0x10
> > panic() at panic+0x128
> > pvclock_get_timecount(ffffffff81f14360) at pvclock_get_timecount+0xc2
> > 
> > The full ddb session, including backtraces for both cores, and the `ps`
> > output is attached as `ddb.txt`.
> 
> So the immediate code of the panic is this:
>         /* This bit must be set as we attached based on the stable flag */
>         if ((flags & PVCLOCK_FLAG_TSC_STABLE) == 0)
>                 panic("%s: unstable result on stable clock", DEVNAME(sc));
> 
> That is, the pvclock driver currently assumes that if it advertises a 
> stable clock when the OpenBSD guest is booted, then it'll remain stable 
> forever.  That apparently is not a safe assumption across a suspend/resume 
> cycle in the Linux/KVM host.
> 

It probably isn't a safe assumption in a live migration scenario either,
if you're correct above.

-ml

> To fix this, the driver would have to get the system to stop using it as 
> the active timecounter whenever it's marked unstable.  Perhaps it could 
> just adjust its quality (sc->sc_tc->tc_quality) downward while that's the 
> case?  I'm not sure if that would be enough, but you could try 
> implementing that.
> 
> Lacking that, I guess you'll want to have KVM stop the guest before you 
> suspend the host, and then on resume wait a bit until the clock 
> settles--not sure how long that takes or how you would know--before 
> restarting the guest.
> 
> 
> Philip Guenther
> 
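
For illustration only, the quality-lowering idea quoted above could be
modelled roughly as below. This is a standalone sketch, not the actual
pvclock driver: struct model_timecounter, struct model_softc and
model_check_stable() are hypothetical stand-ins, the flag value 0x01 is
an assumption, and the model embeds the timecounter rather than pointing
to it. It only shows demoting tc_quality instead of panicking. Whether
the kernel would then actually switch away from the active timecounter
is exactly the open question raised above, and is not modelled here.

```c
#include <stdint.h>

/* Assumed bit value; the thread only names the flag. */
#define PVCLOCK_FLAG_TSC_STABLE	0x01

/* Hypothetical stand-in for the kernel's struct timecounter. */
struct model_timecounter {
	int	tc_quality;	/* negative: not to be picked automatically */
};

/* Hypothetical stand-in for the driver's softc. */
struct model_softc {
	struct model_timecounter sc_tc;
	int	sc_good_quality; /* quality to restore once stable again */
};

/*
 * Instead of panicking when the stable bit is clear, demote the
 * timecounter; restore the quality when the bit is set again.
 * Returns 1 if the reading can be trusted, 0 otherwise.
 */
int
model_check_stable(struct model_softc *sc, uint8_t flags)
{
	if ((flags & PVCLOCK_FLAG_TSC_STABLE) == 0) {
		sc->sc_tc.tc_quality = -1;
		return 0;
	}
	sc->sc_tc.tc_quality = sc->sc_good_quality;
	return 1;
}
```

A caller would pass in the flags it just read from the shared time
info and fall back to some other time source whenever the function
returns 0.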
