Re: OSv time drifting when running under KVM

2018-10-22 Thread Nadav Har'El
On Sat, Oct 20, 2018 at 4:27 AM Rick Payne wrote:

>
> OSv sets the MSR once, and then uses the result repeatedly - in fact it
> barrier()s it which is odd as we know it shouldn't change unless we
> write the MSR - which makes me think there was some misunderstanding
> here (which could easily be on my part!).
>

Very interesting - I think it is indeed your understanding that was correct.
Surprisingly, nobody ever noticed this bug in OSv, including Glauber who
wrote the KVM docs you quoted :-)


> So do I make the changes to do this each time - not sure of the
> cost/penalty of doing this. Or am I missing something else?
>

You can measure the cost of this - try a loop of C's time() for example,
before and after your patch, and see what the difference is. I think
there might be a fairly significant difference, because writing an MSR
requires an exit to the hypervisor :-(
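
A rough sketch of such a measurement, written in Python rather than C for
brevity (so every call also pays interpreter overhead, and only the
before/after difference is meaningful). The helper name ns_per_call and the
iteration count are made up for illustration:

```python
import time


def ns_per_call(fn, iterations=200_000):
    """Return the average cost of one fn() call in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


if __name__ == "__main__":
    # time.time() reads the wall clock; run this once without the patch and
    # once with it, then compare the two numbers.
    print(f"time.time(): {ns_per_call(time.time):.1f} ns per call")
```

Running it a few times on an otherwise idle guest should give a feel for how
noisy the numbers are before comparing the two builds.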

At least the uptime() functions we need for internal uses such as the
scheduler, don't have this MSR usage.


> How does ntp slew time - does it just set the host time incrementally,
> or is there another interface it's using (can't easily check, offshore
> at present).
>

It's been years since I looked at NTP source code, but I remember that
Linux added an adjtime() function to gradually adjust the wall clock in
small steps. Theoretically, whenever such a time step takes effect, Linux
could also tell KVM, which would then write the new value to all the
wall-clock addresses previously subscribed. But evidently it doesn't. I
don't know why, or if such an idea was ever considered.

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: OSv time drifting when running under KVM

2018-09-17 Thread Nadav Har'El
On Mon, Sep 17, 2018 at 4:20 AM, Rick Payne wrote:

> On Mon, 2018-09-17 at 01:41 +0300, Nadav Har'El wrote:
> > I have a wild guess, but I'm not a big clock expert, and I'm CCing
> > Glauber who might have better ideas.
> >
> > My guess is that you have ntp running in the *host*, but not in the
> > guest (we don't have an ntp client for OSv), and somehow this is
> > causing this drift in wall-clock time between the two. My guess (and
> > again, this is a guess, I don't know if that's true) is that the
> > adjtime / adjtimex / ntp_adjtime or whatever system call that ntp
> > uses to gradually adjust the time in the host, doesn't cause the same
> > adjustment to be propagated to the guest by the paravirtual clock
> > mechanism (which probably relies on the clock frequency being fixed,
> > while adjtime tweaks it a bit).
>
> So I'm on a boat, and no NTP server - however you seem to have nailed
> it.
>

On one hand, it's great that we understand this now, but on the other hand
it's very sad that although the host keeps accurate time (or at least thinks
that it does), it cannot just pass it to the guest (via kvm-clock) as we
always implicitly assumed that it does.

It seems to me (but again, I'm not an expert on this) that if QEMU or KVM
is unable to track the host ntp's adjtime() modifications, it needs to
modify the *wall clock* value periodically to track the host's changing
notion of how long ago the epoch was.
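
To make the idea concrete, here is a toy model in Python. It is only a sketch
of the guess above, not how KVM or OSv actually work: it assumes the host
slews its wall clock at a constant rate and that the guest sees that slew
only when the base wall-clock value is refreshed. The function name, the
100 ppm rate and the 60-second refresh period are all invented for
illustration:

```python
def drift_seconds(elapsed_s, slew_ppm, refresh_every_s=None):
    """Host-minus-guest wall-clock offset after elapsed_s seconds (toy model).

    Assumption: the host slews its wall clock by slew_ppm parts per million,
    and the guest only picks up that slew when the base wall-clock value it
    adds elapsed time to is refreshed.
    """
    if refresh_every_s is None:
        # Base wall clock read once at boot: the whole slew is missed.
        unseen_s = elapsed_s
    else:
        # Only the slew accumulated since the most recent refresh is missed.
        unseen_s = elapsed_s % refresh_every_s
    return unseen_s * slew_ppm * 1e-6


if __name__ == "__main__":
    one_hour = 3600
    # Base value never refreshed.
    print(drift_seconds(one_hour, slew_ppm=100))
    # Base value refreshed once a minute, sampled 30 s after a refresh.
    print(drift_seconds(one_hour + 30, slew_ppm=100, refresh_every_s=60))
```

In this model the offset grows without bound when the base value is read
once, and stays bounded by one refresh period's worth of slew otherwise.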

I suspect that this issue is not specific to OSv guests, and will also
occur on Linux guests which do not run ntpd inside them. If this is indeed
the case (and it would be great if you could verify this), I think we should
ask the KVM experts on the KVM mailing list for advice on what can be done.
Popular wisdom on the web suggests that you must run an ntp client on your
guest as well, and with some effort we can get some ntp client (e.g.,
chrony) working on OSv. But in the long run, that would be sad for KVM - if
KVM has the opportunity to pass the guest a perfectly accurate clock (based
on ntp running in the host) it would be a waste not to seize that
opportunity, and I wonder if there's a reason why not.



>
> >
> > You can verify this guess by stopping the ntpd/chronyd daemon in the
> > host and seeing if the drift remains or goes away.
>
> I turned off the systemd timesync (timedatectl set-ntp off) and now it
> all works perfectly.
>

Wow, it's so sad to see that systemd took over yet another stand-alone
daemon. Yet another nail in the coffin of the Unix philosophy :-(



Re: OSv time drifting when running under KVM

2018-09-16 Thread Rick Payne
On Mon, 2018-09-17 at 01:41 +0300, Nadav Har'El wrote:
> I have a wild guess, but I'm not a big clock expert, and I'm CCing
> Glauber who might have better ideas.
> 
> My guess is that you have ntp running in the *host*, but not in the
> guest (we don't have an ntp client for OSv), and somehow this is
> causing this drift in wall-clock time between the two. My guess (and
> again, this is a guess, I don't know if that's true) is that the
> adjtime / adjtimex / ntp_adjtime or whatever system call that ntp
> uses to gradually adjust the time in the host, doesn't cause the same
> adjustment to be propagated to the guest by the paravirtual clock
> mechanism (which probably relies on the clock frequency being fixed,
> while adjtime tweaks it a bit).

So I'm on a boat, and no NTP server - however you seem to have nailed
it.

> 
> You can verify this guess by stopping the ntpd/chronyd daemon in the
> host and seeing if the drift remains or goes away.

I turned off the systemd timesync (timedatectl set-ntp off) and now it
all works perfectly.

> I tried to look on Google if my guess has any merit, and something
> which surprised me is that a lot of people suggest running ntpd on
> *both* host and guest. But if running ntpd on the host alone would
> have magically caused the guest's clock to also be accurate, why would
> anyone recommend running ntpd on the guest? So maybe the guest indeed
> misses the ntpd adjustments from the host? I couldn't find anyone
> discussing this. Maybe Glauber remembers something on this.

So now I need to find out what on earth the systemd service is doing -
as clearly it's the root cause for my problem. Secondly I need to find
out what I can run on the host to keep the time in sync in a sane way
such that the guest OSv processes do not drift.

I'll check to see what the deployed service is using in terms of time
synchronisation...

Rick



Re: OSv time drifting when running under KVM

2018-09-16 Thread Nadav Har'El
On Mon, Sep 17, 2018 at 1:21 AM, Rick Payne wrote:

> On Fri, 2018-09-14 at 12:12 -0700, Waldek Kozaczuk wrote:
> > Stupid question: are you sure the KVM acceleration is actually enabled
> > and kvm-clock actually used? If not OSv would fall back to hpet which
> > we know we have some problems with.
>
> The processor flags are: "sse3 cmpxchg16b x2apic clflush kvmclock
> kvmclock2 kvm_pv_eoi kvmclock_stable"
>
> I'm pretty sure that means it's using the kvmclock, so I'm at a bit of a
> loss to understand why it's drifting (and it's a second or more an hour,
> so quite significant).
>

I just sent a guess that maybe ntp in the host is to blame, but I don't see
how this can explain a second each hour. I thought it was just a couple of
seconds each day...
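
To put the two figures side by side, they can be expressed as parts per
million of elapsed time: a second per hour is roughly 278 ppm, while two
seconds per day is roughly 23 ppm. A quick Python sketch of that arithmetic
(the function name is just for illustration):

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400


def drift_ppm(drift_s, interval_s):
    """Express a clock drift as parts per million of the elapsed interval."""
    return drift_s / interval_s * 1_000_000


if __name__ == "__main__":
    print(f"1 s per hour = {drift_ppm(1, SECONDS_PER_HOUR):.0f} ppm")
    print(f"2 s per day  = {drift_ppm(2, SECONDS_PER_DAY):.0f} ppm")
```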


> This has been running a few hours:
>
> # curl http://192.168.x.x/os/date && TZ=UTC date
> "Sun Sep 16 22:17:39 UTC 2018"
>  Sun Sep 16 22:17:57 UTC 2018
>
> Rick
>



Re: OSv time drifting when running under KVM

2018-09-16 Thread Nadav Har'El
On Thu, Sep 13, 2018 at 10:19 PM, Rick Payne wrote:

>
> We have a problem with OSv's wall clock drifting away from the
> hypervisor's. For example, this VM has been running for under 24hrs,
> and when I compare the hypervisor time, with that retrieved from the
> httpserver-api, I get this:
>
> $ curl http://192.168.x.x/os/date && TZ=UTC date
> "Thu Sep 13 19:17:25 UTC 2018" Thu Sep 13 19:17:28 UTC 2018
>
> The OSv images are being run under virsh control, using kvm. We've
> tried a few configuration options for the clock on KVM but it's not
> helping. Any ideas what we're doing wrong?
>

I have a wild guess, but I'm not a big clock expert, and I'm CCing Glauber
who might have better ideas.

My guess is that you have ntp running in the *host*, but not in the guest
(we don't have an ntp client for OSv), and somehow this is causing this
drift in wall-clock time between the two. My guess (and again, this is a
guess, I don't know if that's true) is that the adjtime / adjtimex /
ntp_adjtime or whatever system call that ntp uses to gradually adjust the
time in the host, doesn't cause the same adjustment to be propagated to the
guest by the paravirtual clock mechanism (which probably relies on the
clock frequency being fixed, while adjtime tweaks it a bit).

You can verify this guess by stopping the ntpd/chronyd daemon in the host
and seeing if the drift remains or goes away.

I tried to look on Google if my guess has any merit, and something which
surprised me is that a lot of people suggest running ntpd on *both* host
and guest. But if running ntpd on the host alone would have magically caused
the guest's clock to also be accurate, why would anyone recommend running
ntpd on the guest? So maybe the guest indeed misses the ntpd adjustments
from the host? I couldn't find anyone discussing this. Maybe Glauber
remembers something on this.



>
> Cheers,
> Rick
>



Re: OSv time drifting when running under KVM

2018-09-16 Thread Rick Payne
On Fri, 2018-09-14 at 12:12 -0700, Waldek Kozaczuk wrote:
> Stupid question: are you sure the KVM acceleration is actually enabled
> and kvm-clock actually used? If not OSv would fall back to hpet which
> we know we have some problems with.

The processor flags are: "sse3 cmpxchg16b x2apic clflush kvmclock
kvmclock2 kvm_pv_eoi kvmclock_stable"

I'm pretty sure that means it's using the kvmclock, so I'm at a bit of a
loss to understand why it's drifting (and it's a second or more an hour,
so quite significant).

This has been running a few hours:

# curl http://192.168.x.x/os/date && TZ=UTC date
"Sun Sep 16 22:17:39 UTC 2018"
 Sun Sep 16 22:17:57 UTC 2018
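
For reference, the offset between the two lines can be computed mechanically.
This is a small Python sketch, assuming the first (quoted) line is the guest's
answer from the REST API and the second is the host's `date` output; the
helper name and format constant are made up for illustration:

```python
from datetime import datetime

# Matches lines such as: Sun Sep 16 22:17:57 UTC 2018
DATE_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def offset_seconds(guest_date, host_date):
    """Return host time minus guest time, in seconds."""
    # The REST API wraps its answer in double quotes; strip them if present.
    guest = datetime.strptime(guest_date.strip().strip('"'), DATE_FORMAT)
    host = datetime.strptime(host_date.strip().strip('"'), DATE_FORMAT)
    return (host - guest).total_seconds()


if __name__ == "__main__":
    print(offset_seconds('"Sun Sep 16 22:17:39 UTC 2018"',
                         " Sun Sep 16 22:17:57 UTC 2018"))
```

For the output above this gives 18 seconds, with the guest behind the host.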

Rick



Re: OSv time drifting when running under KVM

2018-09-14 Thread Rick Payne
On Fri, 2018-09-14 at 16:25 -0700, Dor Laor wrote:
> Need to look at the guest log and the restapi

The log says:

/usr/bin/qemu-system-x86_64 -name xxx -S -machine
pc-i440fx-xenial,accel=kvm,usb=off -m 8192 -realtime mlock=off -smp
4,sockets=4,cores=1,threads=1 ...

Rick



Re: OSv time drifting when running under KVM

2018-09-14 Thread Dor Laor
Need to look at the guest log and the restapi

On Fri, Sep 14, 2018, 15:53 Rick Payne wrote:

> On Fri, 2018-09-14 at 12:12 -0700, Waldek Kozaczuk wrote:
> > Stupid question: are you sure the KVM acceleration is actually enabled
> > and kvm-clock actually used? If not OSv would fall back to hpet which
> > we know we have some problems with.
>
> My turn for a stupid question - how would I know?
>
> I do 'virsh edit ...' to edit the XML and the first line is:
>
> 
>
> so I assumed it's kvm based. I tried to ensure that kvmclock was used by
> changing the timer settings to:
>
>   
> 
>   
>
> Cheers,
> Rick
>



Re: OSv time drifting when running under KVM

2018-09-14 Thread Rick Payne
On Fri, 2018-09-14 at 12:12 -0700, Waldek Kozaczuk wrote:
> Stupid question: are you sure the KVM acceleration is actually enabled
> and kvm-clock actually used? If not OSv would fall back to hpet which
> we know we have some problems with.

My turn for a stupid question - how would I know?

I do 'virsh edit ...' to edit the XML and the first line is:



so I assumed it's kvm based. I tried to ensure that kvmclock was used by
changing the timer settings to:

  

  

Cheers,
Rick



Re: OSv time drifting when running under KVM

2018-09-14 Thread Waldek Kozaczuk
Stupid question: are you sure the KVM acceleration is actually enabled and
kvm-clock actually used? If not OSv would fall back to hpet which we know 
we have some problems with.

Waldek

On Thursday, September 13, 2018 at 3:19:30 PM UTC-4, rickp wrote:
>
>
> We have a problem with OSv's wall clock drifting away from the 
> hypervisor's. For example, this VM has been running for under 24hrs, 
> and when I compare the hypervisor time, with that retrieved from the 
> httpserver-api, I get this: 
>
> $ curl http://192.168.x.x/os/date && TZ=UTC date 
> "Thu Sep 13 19:17:25 UTC 2018" Thu Sep 13 19:17:28 UTC 2018 
>
> The OSv images are being run under virsh control, using kvm. We've 
> tried a few configuration options for the clock on KVM but it's not
> helping. Any ideas what we're doing wrong? 
>
> Cheers, 
> Rick 
>
>
