On Sun, Oct 20, 2019 at 07:08:27AM +0000, Toby Betts wrote:
> Otto Moerbeek wrote:
> > So it looks like the hyperv0 sensor is giving wrong information
> > occasionally.
> >
> > As a workaround, you can try to disable the sensor line in ntpd.conf.
> > After that, set the time to something a bit behind the real time and
> > reboot. That should give you proper time (in automatic mode, ntpd will
> > only jump the clock forward).
>
> This occurred on a VM that I was preparing to run in production with
> Ansible, and part of that preparation involves a reboot after
> disabling smtpd. The workaround I've implemented is to put "rcctl
> disable ntpd" into my install.site script in siteXX.tgz for future
> deployments. If this VM were important, rdate may also help repair the
> system clock.
>
> > I like your suggestion to also subject sensors to constraints; I'll put
> > it on my todo list.
> >
> > Having said that, a sensor providing rogue time is bad, so can you
> > tell me more about your VM host? I might try to reproduce (and fix) the
> > sensor bug before doing the constraint thing.
>
> The Hyper-V host is a year-old Windows 10 Pro workstation running
> build 10.0.17763.805. It had only been up for about 8-9 days when this
> issue manifested on one of the VMs. It's the first time I've ever seen
> something like this happen to a VM's clock.
>
> I noticed that overall VM performance was fairly poor that day: I
> run Ansible from a VM, and the playbook I use to configure OpenBSD VMs
> was taking 4 hours to complete while CPU utilization was consistently
> high. I restarted the host today and the playbook finished in about
> 2 hours 45 minutes, so high load on the Hyper-V host may be a factor.
> I paused the affected VM before restarting the host in case it
> warrants further investigation.
>
>
> Toby
>
Thanks. I think trying to reproduce this bug is probably wasted
effort, so I concentrated on the validation question. See diff.
Note that the variable constraint_cnt does not mean what you might
think it means.
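As a rough standalone illustration of the idea in the diff (this is not
the ntpd code: the sketch_* names and the SKETCH_MARGIN tolerance are
made up for the example, and the real comparison lives in
constraint_check()), a sensor reading is used only when no constraints
are configured, or when a constraint median is already known and the
sensor-derived time lies close to it:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tolerance in seconds; not taken from the ntpd source. */
#define SKETCH_MARGIN	120.0

/*
 * Sensor-derived time, mirroring the expression in the diff: the sensor
 * value is an offset in nanoseconds (sign inverted), the configured
 * correction is in microseconds.
 */
static double
sketch_sensor_time(double now, int64_t value_ns, int64_t correction_us)
{
	return now + ((double)value_ns / -1e9) +
	    ((double)correction_us / 1e6);
}

/*
 * Decide whether a sensor reading may be used.
 * median == 0 means "constraints configured, but no result yet".
 */
static bool
sketch_sensor_ok(bool have_constraints, double median, double sens_time)
{
	double diff;

	if (!have_constraints)
		return true;	/* nothing to validate against */
	if (median == 0)
		return false;	/* wait until a constraint is known */
	diff = sens_time - median;
	if (diff < 0)
		diff = -diff;
	return diff <= SKETCH_MARGIN;
}
```

With constraints configured, a sensor that is hours off would be
rejected here instead of being allowed to steer the clock.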
-Otto
Index: ntp.c
===================================================================
RCS file: /cvs/src/usr.sbin/ntpd/ntp.c,v
retrieving revision 1.159
diff -u -p -r1.159 ntp.c
--- ntp.c 16 Jul 2019 14:15:40 -0000 1.159
+++ ntp.c 21 Oct 2019 04:40:15 -0000
@@ -246,7 +246,8 @@ ntp_main(struct ntpd_conf *nconf, struct
idx_peers = i;
sent_cnt = trial_cnt = 0;
TAILQ_FOREACH(p, &conf->ntp_peers, entry) {
- if (constraint_cnt && conf->constraint_median == 0)
+ if (!TAILQ_EMPTY(&conf->constraints) &&
+ conf->constraint_median == 0)
continue;
if (p->next > 0 && p->next <= getmonotime()) {
@@ -298,7 +299,9 @@ ntp_main(struct ntpd_conf *nconf, struct
}
idx_clients = i;
- if (!TAILQ_EMPTY(&conf->ntp_conf_sensors)) {
+ if (!TAILQ_EMPTY(&conf->ntp_conf_sensors) &&
+ (TAILQ_EMPTY(&conf->constraints) ||
+ conf->constraint_median != 0)) {
if (last_sensor_scan == 0 ||
last_sensor_scan + SENSOR_SCAN_INTERVAL <=
getmonotime()) {
sensors_cnt = sensor_scan();
Index: sensors.c
===================================================================
RCS file: /cvs/src/usr.sbin/ntpd/sensors.c,v
retrieving revision 1.52
diff -u -p -r1.52 sensors.c
--- sensors.c 3 Sep 2016 11:52:06 -0000 1.52
+++ sensors.c 21 Oct 2019 04:40:15 -0000
@@ -165,6 +165,7 @@ sensor_query(struct ntp_sensor *s)
{
char dxname[MAXDEVNAMLEN];
struct sensor sensor;
+ double sens_time;
if (conf->settime)
s->next = getmonotime() + SENSOR_QUERY_INTERVAL_SETTIME;
@@ -193,6 +194,19 @@ sensor_query(struct ntp_sensor *s)
return;
s->last = sensor.tv.tv_sec;
+
+
+ if (!TAILQ_EMPTY(&conf->constraints)) {
+ if (conf->constraint_median == 0) {
+ return;
+ }
+ sens_time = gettime() + (sensor.value / -1e9) +
+ (s->correction / 1e6);
+ if (constraint_check(sens_time) != 0) {
+ log_info("sensor %s: constraint check failed",
+ s->device);
+ return;
+ }
+ }
/*
* TD = device time
* TS = system time