On Mon, Aug 05, 2069 at 09:33:05AM +0000, Toby Betts wrote:
> >Synopsis: ntpd(8) adds time since epoch to system clock
> >Category: system
> >Environment:
> System : OpenBSD 6.6
> Details : OpenBSD 6.6 (GENERIC.MP) #372: Sat Oct 12 10:56:27 MDT
> 2019
>
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> After rebooting a VM, ntpd(8) adjusted the system clock to 2069-08-04.
> The reboot command was given at approximately 2019-10-18 23:34:49 UTC,
> which is about 1571441689 seconds since the UNIX epoch, 1970-01-01.
> According to /var/log/daemon, stepping the clock from 2019-10-18 to
> 2069-08-04 is 1571441749.182430 seconds, which is approximately the
> same number of seconds as the system time on the VM when ntpd(8) was
> started again after the reboot.
>
> In other words when ntpd(8) started, it set the system clock to double the
> current time in seconds since 1970.
>
> Output of /var/log/daemon:
>
> Oct 18 22:53:06 obsd ntpd[96201]: creating new /var/db/ntpd.drift
> Oct 18 22:53:06 obsd ntpd[3505]: ntp engine ready
> Oct 18 22:53:08 obsd ntpd[96201]: set local clock to Fri Oct 18 22:53:08 UTC
> 2019 (offset 1.652045s)
> Oct 18 22:53:08 obsd savecore: no core dump
> Oct 18 22:53:09 obsd ntpd[3505]: constraint reply from 172.217.3.196: offset
> -0.075150
> Oct 18 22:53:28 obsd ntpd[3505]: peer 47.190.36.230 now valid
> Oct 18 22:53:29 obsd ntpd[3505]: peer 3.15.245.6 now valid
> Oct 18 22:53:30 obsd ntpd[3505]: peer 162.159.200.1 now valid
> Oct 18 22:53:30 obsd ntpd[3505]: peer 184.105.182.16 now valid
> Oct 18 22:53:31 obsd ntpd[3505]: peer 198.50.238.163 now valid
> Oct 18 22:58:41 obsd ntpd[3505]: clock is now synced
> Oct 18 22:58:41 obsd ntpd[3505]: constraint reply from 172.217.3.196: offset
> -0.924238
> Oct 18 23:03:44 obsd ntpd[3505]: peer 3.15.245.6 now invalid
> Oct 18 23:04:04 obsd ntpd[3505]: peer 104.131.139.195 now valid
> Oct 18 23:19:12 obsd ntpd[39106]: adjusting clock frequency by -20.378477 to
> -20.378477ppm
>
> [ reboot issued approximately here ]
>
> Oct 18 23:35:12 obsd ntpd[39106]: adjusting clock frequency by -0.737827 to
> -21.116304ppm
> Oct 18 23:35:53 obsd ntpd[35750]: ntp engine ready
> Aug 4 23:11:42 obsd ntpd[79447]: set local clock to Sun Aug 4 23:11:42 UTC
> 2069 (offset 1571441749.182430s)
> Aug 4 23:11:42 obsd savecore: no core dump
> Aug 4 23:11:43 obsd ntpd[35750]: constraint reply from 172.217.3.196: offset
> -1571441748.432712
> Aug 4 23:12:02 obsd ntpd[35750]: peer 45.79.36.123 now valid
> Aug 4 23:12:03 obsd ntpd[35750]: peer 162.159.200.1 now valid
> Aug 4 23:12:03 obsd ntpd[35750]: peer 162.159.200.1 now valid
> Aug 4 23:12:04 obsd ntpd[35750]: peer 103.105.51.156 now valid
> Aug 4 23:12:08 obsd ntpd[35750]: peer 199.101.100.221 now valid
> Aug 4 23:13:02 obsd ntpd[23365]: adjusting local clock by -1571441747.612037s
> Aug 4 23:16:16 obsd ntpd[23365]: adjusting local clock by -1571441746.642918s
> Aug 4 23:17:19 obsd ntpd[23365]: adjusting local clock by -1571441746.326408s
> Aug 4 23:21:43 obsd ntpd[23365]: adjusting local clock by -1571441744.999186s
> Aug 4 23:22:14 obsd ntpd[23365]: adjusting local clock by -1571441744.843526s
> Aug 4 23:26:29 obsd ntpd[23365]: adjusting local clock by -1571441743.563666s
> Aug 4 23:27:35 obsd ntpd[23365]: adjusting local clock by -1571441743.233665s
> Aug 4 23:28:05 obsd ntpd[23365]: adjusting local clock by -1571441743.082044s
> Aug 4 23:32:22 obsd ntpd[23365]: adjusting local clock by -1571441741.788132s
> Aug 4 23:33:23 obsd ntpd[35750]: peer 45.79.36.123 now invalid
> Aug 4 23:33:46 obsd ntpd[35750]: peer 72.87.88.202 now valid
> Aug 4 23:34:43 obsd ntpd[23365]: adjusting local clock by -1571441741.077493s
> Aug 4 23:37:26 obsd ntpd[23365]: adjusting local clock by -1571441740.260640s
> Aug 4 23:40:41 obsd ntpd[23365]: adjusting local clock by -1571441739.281630s
> Aug 4 23:42:49 obsd ntpd[23365]: adjusting local clock by -1571441738.634681s
> Aug 4 23:43:55 obsd ntpd[23365]: adjusting local clock by -1571441738.304680s
> Aug 4 23:44:28 obsd ntpd[23365]: adjusting local clock by -1571441738.139680s
> Aug 4 23:45:35 obsd ntpd[23365]: adjusting local clock by -1571441737.800824s
> Aug 4 23:47:07 obsd ntpd[23365]: adjusting local clock by -1571441737.340824s
> Aug 4 23:47:41 obsd ntpd[23365]: adjusting local clock by -1571441737.170823s
> Aug 4 23:49:20 obsd ntpd[23365]: adjusting local clock by -1571441736.675189s
> Aug 4 23:52:29 obsd ntpd[23365]: adjusting local clock by -1571441735.717282s
> Aug 4 23:56:39 obsd ntpd[23365]: adjusting local clock by -1571441734.467036s
> Aug 4 23:58:16 obsd ntpd[23365]: adjusting local clock by -1571441733.977244s
> Aug 5 00:01:30 obsd ntpd[23365]: adjusting local clock by -1571441733.007243s
> Aug 5 00:02:36 obsd ntpd[23365]: adjusting local clock by -1571441732.672212s
> Aug 5 00:04:44 obsd ntpd[23365]: adjusting local clock by -1571441732.028314s
> Aug 5 00:09:05 obsd ntpd[23365]: adjusting local clock by -1571441730.712893s
> Aug 5 00:13:12 obsd ntpd[23365]: adjusting local clock by -1571441729.477397s
> Aug 5 00:17:28 obsd ntpd[23365]: adjusting local clock by -1571441728.188918s
> Aug 5 00:19:07 obsd ntpd[23365]: adjusting local clock by -1571441727.690377s
> Aug 5 00:23:20 obsd ntpd[23365]: adjusting local clock by -1571441726.418376s
> Aug 5 00:24:55 obsd ntpd[23365]: adjusting local clock by -1571441725.942814s
> Aug 5 00:28:13 obsd ntpd[23365]: adjusting local clock by -1571441724.947839s
> Aug 5 00:32:30 obsd ntpd[23365]: adjusting local clock by -1571441723.656048s
> Aug 5 00:33:02 obsd ntpd[23365]: adjusting local clock by -1571441723.496047s
> Aug 5 00:36:13 obsd ntpd[23365]: adjusting local clock by -1571441722.533781s
> Aug 5 00:39:27 obsd ntpd[23365]: adjusting local clock by -1571441721.560180s
> Aug 5 00:43:11 obsd ntpd[23365]: adjusting local clock by -1571441720.435645s
> Aug 5 00:44:13 obsd ntpd[23365]: adjusting local clock by -1571441720.121702s
> Aug 5 00:47:19 obsd ntpd[23365]: adjusting local clock by -1571441719.183850s
> Aug 5 00:49:24 obsd ntpd[23365]: adjusting local clock by -1571441718.558849s
> Aug 5 00:51:26 obsd ntpd[23365]: adjusting local clock by -1571441717.948692s
> Aug 5 00:54:07 obsd ntpd[23365]: adjusting local clock by -1571441717.139054s
> Aug 5 00:55:43 obsd ntpd[23365]: adjusting local clock by -1571441716.659053s
> Aug 5 00:58:55 obsd ntpd[23365]: adjusting local clock by -1571441715.689568s
> Aug 5 00:59:58 obsd ntpd[23365]: adjusting local clock by -1571441715.374567s
> Aug 5 01:03:41 obsd ntpd[23365]: adjusting local clock by -1571441714.252451s
> Aug 5 01:04:47 obsd ntpd[23365]: adjusting local clock by -1571441713.922451s
> Aug 5 01:08:33 obsd ntpd[23365]: adjusting local clock by -1571441712.787129s
> Aug 5 01:10:12 obsd ntpd[23365]: adjusting local clock by -1571441712.287127s
> Aug 5 01:14:02 obsd ntpd[23365]: adjusting local clock by -1571441711.131053s
> Aug 5 01:16:38 obsd ntpd[23365]: adjusting local clock by -1571441710.345689s
> Aug 5 01:19:17 obsd ntpd[23365]: adjusting local clock by -1571441709.544722s
> Aug 5 01:21:24 obsd ntpd[23365]: adjusting local clock by -1571441708.908260s
> Aug 5 01:23:02 obsd ntpd[23365]: adjusting local clock by -1571441708.415668s
> Aug 5 01:26:16 obsd ntpd[23365]: adjusting local clock by -1571441707.442792s
> Aug 5 01:27:21 obsd ntpd[23365]: adjusting local clock by -1571441707.113015s
> Aug 5 01:27:53 obsd ntpd[23365]: adjusting local clock by -1571441706.952315s
> Aug 5 01:28:24 obsd ntpd[23365]: adjusting local clock by -1571441706.792990s
> Aug 5 01:31:31 obsd ntpd[23365]: adjusting local clock by -1571441705.857989s
> Aug 5 01:35:53 obsd ntpd[23365]: adjusting local clock by -1571441704.541153s
> Aug 5 01:39:08 obsd ntpd[23365]: adjusting local clock by -1571441703.562222s
> Aug 5 01:42:56 obsd ntpd[23365]: adjusting local clock by -1571441702.415392s
> Aug 5 01:43:43 obsd ntpd[35750]: peer 72.87.88.202 now invalid
> Aug 5 01:44:06 obsd ntpd[35750]: peer 198.50.238.163 now valid
> Aug 5 01:46:07 obsd ntpd[23365]: adjusting local clock by -1571441701.452365s
> Aug 5 01:50:28 obsd ntpd[23365]: adjusting local clock by -1571441700.143086s
> Aug 5 01:51:32 obsd ntpd[23365]: adjusting local clock by -1571441699.821617s
> Aug 5 01:54:15 obsd ntpd[23365]: adjusting local clock by -1571441699.000602s
> Aug 5 01:57:24 obsd ntpd[35750]: peer 103.105.51.156 now invalid
> Aug 5 01:57:44 obsd ntpd[35750]: peer 142.147.92.5 now valid
> Aug 5 01:58:34 obsd ntpd[23365]: adjusting local clock by -1571441697.699633s
> Aug 5 02:01:18 obsd ntpd[23365]: adjusting local clock by -1571441696.874084s
>
> [ ntpd continues trying to adjust the time by fractions of a second ]
>
> The value of /etc/ntpd.conf was not changed from the default:
>
> # $OpenBSD: ntpd.conf,v 1.15 2019/07/04 05:19:31 deraadt Exp $
> #
> # See ntpd.conf(5) and /etc/examples/ntpd.conf
>
> servers pool.ntp.org
> server time.cloudflare.com
> sensor *
> constraints from "https://www.google.com"
>
> Since this is a VM, it also queries hyperv0 for timesync information,
> but I do not know if the hyperv0 sensor information is sanity checked
> against the default Google HTTPS constraint.
I strongly suspect a bug in either the hypervisor or our hyperv(4)
TIMESYNC code.
ntpd(8) trusts the sensor implicitly. It can't "sanity check" it.
If the sensor is wildly incorrect then ntpd(8) will do something
similarly incorrect: monkey see, monkey do. Any sort of heuristic
logic in userspace adds complexity which inevitably leads to more
bugs. If you can't trust the (paravirtual) hardware then you are
screwed.
> >How-To-Repeat:
> Having repeatedly reinstalled this VM from an autoinstall(8)
> config script a dozen times over the last day trying to get a
> working 6.6 VM, this has only happened once.
Sounds like a race.
Perhaps the hypervic(4) TIMESYNC operation and inittodr(9) do not
always run in the same order.
The system UTC clock starts at zero and jumps forward to a
"reasonable" value when inittodr(9) is called when the root filesystem
is first mounted. In inittodr(9) we read the RTC and select a UTC
time to jump to.
But, if I'm reading the dmesg correctly, the root filesystem is
mounted *after* hyperv(4) is attached. Can a hyperv(4) TIMESYNC
operation happen before then?
Let me spin some yarn:
Perhaps this code here in pv/hypervic.c is *sometimes* running
before inittodr(9):
477 case VMBUS_ICMSG_TYPE_TIMESYNC:
478 msg = (struct vmbus_icmsg_timesync *)hdr;
479 if (msg->ic_tsflags == VMBUS_ICMSG_TS_FLAG_SAMPLE) {
480 microtime(&sc->sc_sensor.tv);
481 nanotime(&guest);
482 tns = (msg->ic_hvtime - 116444736000000000LL) *
100;
483 host.tv_sec = tns / 1000000000LL;
484 host.tv_nsec = tns % 1000000000LL;
485 timespecsub(&guest, &host, &diff);
486 sc->sc_sensor.value = (int64_t)diff.tv_sec *
487 1000000000LL + diff.tv_nsec;
488 sc->sc_sensor.status = SENSOR_S_OK;
489 }
490 break;
What would happen in this case?
We call nanotime(9). Because we haven't done inittodr(9) yet it would
return a value close to zero.
Because the hypervisor is a black box let's assume it's not buggy. In
that case, ic_hvtime is very close to the correct UTC time. We
convert this value from a WIN32 FILETIME to a UNIX time and pack that
value into host. Subtracting host from guest then yields a value that
is roughly the negation of the correct UTC time. We convert this
value to nanoseconds and stuff it into sc_sensor.value.
Then, later in boot, we mount the root filesystem and run inittodr(9).
This jumps the UTC clock from zero to the RTC time. Assume this is
also pretty close to the correct UTC time.
Then we bring up userspace. We haven't had another hyperv(4) TIMESYNC
since that first one, so the sensor value is still a huge negative
number.
ntpd(8) is started. It reads the hyperv(4) sensor, sees that huge
negative number, and jumps the UTC clock forward by that amount to
compensate.
Then we get our constraints from the upper stratum NTP servers. They
tell us our time is way too far into the future. We start trying to
nudge the UTC clock backwards with adjtime(2).
--
So, can we enable some extra debug output and log TIMESYNC operations
to see what we're putting in the sensor? We can then compare the
timestamps between /var/log/messages and /var/log/daemon to see if
there's a race between TIMESYNC, inittodr(9), and ntpd(8).
I don't know anything about hyperv(4), though. Take this all with a
grain of salt until a proper expert weighs in here.
-Scott