Ok, so circling back to this as I'm still having issues on the deployed
boxes. I suspect from the e-mails below that this is caused by changes
in the host time not making it into OSv.

This seems to be related to KVM_WALL_CLOCK_NEW not being updated. The
docs say:

        data: 4-byte alignment physical address of a memory area which must be
        in guest RAM. This memory is expected to hold a copy of the following
        structure:

        struct pvclock_wall_clock {
                u32   version;
                u32   sec;
                u32   nsec;
        } __attribute__((__packed__));

        whose data will be filled in by the hypervisor. The hypervisor is only
        guaranteed to update this data at the moment of MSR write.
        Users that want to reliably query this information more than once have
        to write more than once to this MSR.

OSv sets the MSR once, and then uses the result repeatedly - in fact it
barrier()s it which is odd as we know it shouldn't change unless we
write the MSR - which makes me think there was some misunderstanding
here (which could easily be on my part!).

Indeed, when I use 'date' to grossly change the host time, the OSv
guest is not seeing this. However, if I modify pvclock::wall_clock_boot
to write the MSR each time, then it gets the change and my OSv guest
stays in-sync with the host.
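
A minimal sketch of the difference, with the hypervisor simulated in plain
C so it can run in userspace. Everything apart from the struct layout is
hypothetical: fake_write_wall_clock_msr() stands in for the real
wrmsr(KVM_WALL_CLOCK_NEW, ...), host_sec stands in for the host clock, and
the odd/even version check is an assumption about how the update is
published, not something the docs quoted above spell out. It is not the
actual OSv pvclock code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Layout from the KVM docs quoted above. */
struct pvclock_wall_clock {
    uint32_t version;
    uint32_t sec;
    uint32_t nsec;
} __attribute__((__packed__));

/* Simulated host wall-clock time (hypothetical stand-in for the host). */
static uint32_t host_sec = 1537171200u;

/*
 * Hypothetical stand-in for writing KVM_WALL_CLOCK_NEW. The structure is
 * only filled in here, mirroring "only guaranteed to update this data at
 * the moment of MSR write".
 */
static void fake_write_wall_clock_msr(volatile struct pvclock_wall_clock *wc)
{
    wc->version++;      /* odd: update in progress (assumed convention) */
    wc->sec = host_sec;
    wc->nsec = 0;
    wc->version++;      /* even: update complete */
}

/* Read whatever snapshot is in memory; retry if an update was in flight. */
static uint32_t read_snapshot(const volatile struct pvclock_wall_clock *wc)
{
    uint32_t v, sec;
    do {
        v = wc->version;
        __sync_synchronize();   /* GCC/Clang full barrier */
        sec = wc->sec;
        __sync_synchronize();
    } while ((v & 1) || v != wc->version);
    return sec;
}

/* The variant described above: write the MSR before every read. */
static uint32_t read_fresh(volatile struct pvclock_wall_clock *wc)
{
    fake_write_wall_clock_msr(wc);
    return read_snapshot(wc);
}

/* Demo: one write at "boot", then the host clock jumps by an hour. */
void demo(void)
{
    volatile struct pvclock_wall_clock wc = {0};
    fake_write_wall_clock_msr(&wc);
    uint32_t boot = read_snapshot(&wc);

    host_sec += 3600;   /* e.g. someone runs 'date' on the host */

    uint32_t stale = read_snapshot(&wc);
    uint32_t fresh = read_fresh(&wc);
    assert(stale == boot);          /* write-once reader misses the jump */
    assert(fresh == boot + 3600);   /* re-writing the MSR picks it up */
    printf("stale=%u fresh=%u\n", (unsigned)stale, (unsigned)fresh);
}
```

Calling demo() shows the write-once reader still returning the boot-time
value after the jump, while the reader that re-writes the "MSR" sees the new
host time. The sketch says nothing about the cost of the real MSR write.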

So should I change it to write the MSR each time? I'm not sure of the
cost/penalty of doing this. Or am I missing something else?

How does ntp slew time - does it just set the host time incrementally,
or is there another interface it's using? (I can't easily check, as I'm
offshore at present.)

Cheers,
Rick

On Mon, 2018-09-17 at 11:17 +0300, Nadav Har'El wrote:
> 
> On Mon, Sep 17, 2018 at 4:20 AM, Rick Payne <ri...@rossfell.co.uk> wrote:
> > On Mon, 2018-09-17 at 01:41 +0300, Nadav Har'El wrote:
> > > I have a wild guess, but I'm not a big clock expert, and I'm CCing
> > > Glauber who might have better ideas.
> > > 
> > > My guess is that you have ntp running in the *host*, but not in the
> > > guest (we don't have an ntp client for OSv), and somehow this is
> > > causing this drift in wall-clock time between the two. My guess (and
> > > again, this is a guess, I don't know if that's true), that the
> > > adjtime / adjtimex / ntp_adjtime or whatever system call that ntp
> > > uses to gradually adjust the time in the host, doesn't cause the same
> > > adjustment to be propagated to the guest by the paravirtual clock
> > > mechanism (which probably relies on the clock frequency being fixed,
> > > while adjtime tweaks it a bit).
> > 
> > So I'm on a boat, and no NTP server - however you seem to have nailed
> > it.
> 
> On one hand, it's great that we understand this now, but on the other
> hand it's very sad that although the host keeps accurate time (or at
> least thinks that it does), it cannot just pass it to the guest (via
> kvm-clock) as we always implicitly assumed that it does.
> 
> It seems to me (but again, I'm not an expert on this), that if QEMU or
> KVM is unable to track the host ntp's adjtime() modifications, it needs
> to modify the *wall clock* value periodically to track the host's
> changing notion of how long ago the epoch was.
> 
> I suspect that this issue is not specific to OSv guests, and will also
> occur on Linux guests which do not run ntpd inside them. If this is
> indeed the case (and it would be great if you could verify this), I
> think we should ask for advice from the KVM experts on the KVM mailing
> list, what can be done. Popular wisdom on the web suggests that you must
> run an ntp client on your guest as well, and with some effort we can get
> some ntp client (e.g., chrony) working on OSv.
> But in the long run, that would be sad for KVM - if KVM has the
> opportunity to pass the guest a perfectly accurate clock (based on ntp
> running in the host) it would be a waste not to seize that opportunity,
> and I wonder if there's a reason why not.
> 
>  
> > > 
> > > You can verify this guess by stopping the ntpd/chronyd daemon in the
> > > host and seeing if the drift remains or goes away.
> > 
> > I turned off the systemd timesync (timedatectl set-ntp off) and now it
> > all works perfectly.
> 
> Wow, it's so sad to see that systemd took over yet another stand-alone
> daemon. Yet another nail in the coffin of the Unix philosophy :-(
> 
> 
