One more update: when I look at `ntpctl -sa` now, it no longer shows any
"peer not valid" errors. However, the clock still runs a good 14 seconds
behind, and it gets worse every minute.
# ntpctl -sa
4/4 peers valid, 1/1 sensors valid, constraint offset -115s (4 errors),
clock unsynced

peer
   wt tl st  next  poll          offset       delay      jitter
213.154.236.182 from pool pool.ntp.org
    1 10  2 3080s 3153s      2298.229ms     3.482ms     1.359ms
83.98.201.134 from pool pool.ntp.org
    1 10  2 3154s 3220s      2764.077ms     2.686ms     0.703ms
217.23.3.234 from pool pool.ntp.org
    1 10  2 2952s 3020s      2682.053ms     2.880ms     0.528ms
185.92.220.131 from pool pool.ntp.org
    1 10  2 2999s 3076s      2266.144ms     2.287ms     0.937ms

sensor
   wt gd st  next  poll          offset  correction
vmmci0
    1  1  0    6s   15s     14607.577ms     0.000ms
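
For a rough sense of scale, the figures from the original report quoted
below (7 seconds lost in the first minute after rdate, about 4 hours lost
over "a couple of days") can be turned into a drift rate. This is only a
back-of-the-envelope sketch in Python: `drift_ppm` is a made-up helper
name, and "a couple of days" is assumed to mean exactly two days.

```python
def drift_ppm(offset_seconds: float, elapsed_seconds: float) -> float:
    """Return clock drift in parts per million (positive = clock runs slow)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return offset_seconds / elapsed_seconds * 1_000_000


# 7 s lost in the first minute after rdate (figure from the original report).
short_term = drift_ppm(7, 60)

# 4 h lost over "a couple of days"; the 2 days here is an assumption.
long_term = drift_ppm(4 * 3600, 2 * 24 * 3600)

print(f"short term: {short_term:.0f} ppm ({short_term / 10_000:.1f}%)")
print(f"long term:  {long_term:.0f} ppm ({long_term / 10_000:.1f}%)")
```

Under those assumptions both estimates land at roughly 8 to 12 percent
slow, which is far beyond the tens of ppm one would expect from an
ordinary hardware oscillator.
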
On Fri, Nov 9, 2018 at 7:18 PM Stefan Arentz <[email protected]>
wrote:
> Here is an update on the situation:
>
> I installed -current on this VM, clean install, and the ntpd error does
> not happen anymore. But the clock issues remain, even with ntpd running.
>
>
> ntpd now starts without complaints and appears to be running its
> regular processes:
>
> _ntp 70093 0.0 0.5 920 2540 ?? S<sp 7:04PM 0:00.02 ntpd:
> ntp engine (ntpd)
> _ntp 51912 0.0 0.5 736 2464 ?? Isp 7:04PM 0:00.01 ntpd:
> dns engine (ntpd)
> root 46674 0.0 0.3 792 1640 ?? S<sp 7:04PM 0:00.00
> /usr/sbin/ntpd -s
>
>
> I have set kern.timecounter.hardware to tsc:
>
> # sysctl kern.timecounter
> kern.timecounter.tick=1
> kern.timecounter.timestepwarnings=0
> kern.timecounter.hardware=tsc
> kern.timecounter.choice=i8254(0) tsc(-1000) dummy(-1000000)
>
>
> trondd asked for the output of ntpctl -sa, which shows me the following:
>
> # ntpctl -sa
> 0/4 peers valid, 1/1 sensors valid, constraint offset -3s, clock unsynced,
> clock offset is 12771.710ms
>
> peer
> wt tl st next poll offset delay jitter
> 213.154.236.182 from pool pool.ntp.org
> 1 2 - 152s 300s ---- peer not valid ----
> 83.98.201.134 from pool pool.ntp.org
> 1 2 - 152s 300s ---- peer not valid ----
> 217.23.3.234 from pool pool.ntp.org
> 1 2 - 152s 300s ---- peer not valid ----
> 185.92.220.131 from pool pool.ntp.org
> 1 2 - 152s 300s ---- peer not valid ----
>
> sensor
> wt gd st next poll offset correction
> vmmci0
> 1 1 0 15s 15s 18001.915ms 0.000ms
>
>
> I am not sure how to interpret these numbers. I also don't understand the
> "peer not valid" messages here. I have another OpenBSD VM which has the
> exact same ntpd.conf and it does not complain about any of the peers.
>
>
> I think my conclusion is that this is not something that can be solved at
> the VM level.
>
> S.
>
>
> On Sat, Nov 3, 2018 at 8:10 PM Stefan Arentz <[email protected]>
> wrote:
>
>> Hi everyone,
>>
>> I am having an issue where an OpenBSD VM running on vmd is having
>> serious clock skew issues.
>>
>> I am relatively new to OpenBSD, so I am not sure how to properly debug
>> this. What I hope is that I can provide a good amount of data and folks
>> here can give me some hints and ask me for additional information to
>> get to the root cause of this.
>>
>> So first some facts and symptoms:
>>
>> - Both Host and Guest are running OpenBSD 6.4. The host runs GENERIC.MP
>> and the guest GENERIC.
>> - The host runs 50 guests, all OpenBSD (openbsd.amsterdam)
>> - Only this VM is having this clock issue (is this correct, or were
>> there others?)
>>
>> - The guest has kern.timecounter.hardware=tsc
>> - The time on the VM was set with rdate a couple of days ago, and as of
>> now the VM is running about 4 hours behind.
>> - ntpd is running (main process, dns engine, ntp engine)
>> - when started or restarted, ntpd complains about "pipe write error
>> (from main): No such file or directory" but does seem to start
>>
>> - I just ran rdate nl.pool.ntp.org and the date was properly updated
>> - One minute after running rdate, the clock is already 7 seconds slow
>>
>> - The guest also has some severe networking issues. Often I cannot type
>> more than a few characters before a ~15 second delay happens.
>> Interactive typing is difficult.
>> - I can SSH into the Host and have none of these issues, ruling out
>> connectivity issues between me (Toronto) and the Host (Amsterdam)
>>
>> It would be easy to blame this on NTPd, which does have an unexplained
>> error message. However, I think even without running NTPd, the clock
>> skew should not be this extreme.
>>
>> Somehow I have a gut feeling that the clock issues and the networking
>> issues are related.
>>
>> I am root on the VM but not on the host. I do have vmctl access.
>> However, the host admin is friendly (Hi Mischa) and is happy to help
>> debug this issue.
>>
>> I tried to ktrace ntpd to get more insight into the "pipe write error
>> (from main): No such file or directory" error, but I did not get any
>> useful info out of it. This may be because of my unfamiliarity with
>> those tools.
>>
>> Help appreciated :-)
>>
>> S.
>>
>>