Hello,
I have an old Sun Ultra 10 with a dead motherboard battery. After a
cold start, the hardware clock now always reports the date as
January 1, 1968. Strange things then happen when I boot OpenBSD
(10.10.6.10 and 10.10.6.11 are my local time servers):
============================================================
[...]
Power On Self Test Failed. Cause: NVRAM U13
ok date
01/01/1968 00:00:24 GMT
ok boot disk
[...]
OpenBSD 4.7-current (GENERIC) #328: Mon May 24 08:54:38 MDT 2010
[email protected]:/usr/src/sys/arch/sparc64/compile/GENERIC
[...]
root on wd0a swap on wd0b dump on wd0b
WARNING: clock lost 21038 days -- CHECK AND RESET THE DATE!
Automatic boot in progress: starting file system checks.
/dev/rwd0a: file system is clean; not checking
[...]
starting named
Nov 26 00:33:09 plara named[24368]: /usr/src/usr.sbin/bind/lib/isc/entropy.c:382: fatal error:
Nov 26 00:33:09 plara named[24368]: RUNTIME_CHECK(isc_time_now((&t)) == 0) failed
Nov 26 00:33:09 plara named[24368]: exiting (due to fatal error in library)
starting initial daemons: ntpd.
Nov 26 00:33:10 plara ntpd[15149]: recvmsg control format 10.10.6.10: No such file or directory
Nov 26 00:33:10 plara ntpd[15149]: recvmsg control format 10.10.6.11: No such file or directory
[...]
standard daemons: apmd cron.
Thu Nov 26 00:33:31 ICT 1931
[...]
============================================================
The warning to check the date is clear enough, though I was still a
bit surprised to see both named and ntpd fail. I don't know why named
cares so much about the date but I'll assume there's a good reason. I
also don't know why or how OpenBSD transforms 01/01/1968 into November
26 1931, but then again, once the battery is dead I guess the hardware
can be considered broken and all bets are off.
Note that the initial warning indicates that the clock lost 21038
days, which is about 57.6 years. 2010.4 (~ today) + 57.6 = 2068 =
1968 + 100, so it looks like the computation is done modulo 100. I
would have expected to see a value like (2010.4 - 1931.8)*365 = 28689
days lost.
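As a rough cross-check of the arithmetic above, here is a small Python
sketch. It rests on two assumptions that the boot log does not confirm:
that the two-digit RTC year 68 is read as 2068, and that time_t is a
signed 32-bit integer, so the value wraps around. The reference time of
May 26 2010, 05:32:07 UTC is simply the moment ntpd set the clock in
the later session (12:32:07 ICT), used as a stand-in for "today", and
ICT is taken to be UTC+7.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ICT = timezone(timedelta(hours=7))  # assumption: ICT is UTC+7


def wrap_int32(seconds):
    """Reduce a second count to a signed 32-bit value (assumed time_t width)."""
    return (seconds + 2**31) % 2**32 - 2**31


# Assumption: the RTC's two-digit year 68 is interpreted as 2068.
rtc = datetime(2068, 1, 1, tzinfo=timezone.utc)
rtc_seconds = int((rtc - EPOCH).total_seconds())  # too large for 32 bits
wrapped = EPOCH + timedelta(seconds=wrap_int32(rtc_seconds))

# Stand-in for "today": the time ntpd set in the later session.
today = datetime(2010, 5, 26, 5, 32, 7, tzinfo=timezone.utc)

days_to_2068 = (rtc - today).days      # whole days from today to 2068
days_to_1931 = (today - wrapped).days  # whole days from the wrapped date to today

print("wrapped clock:", wrapped.astimezone(ICT).isoformat())
print("days between today and 2068:", days_to_2068)
print("days between 1931 and today:", days_to_1931)
```

Under those assumptions the wrapped value lands on November 26 1931 at
00:31:44 ICT, a minute or two before the first syslog lines of the boot
above, and the distance to 2068 comes out at 21038 whole days, the
figure in the kernel warning.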
Anyway, what I'm mostly surprised about is the behavior of ntpd, since
I expected it to correct the machine's date instead of failing:
# egrep ntpd /etc/rc.conf.local
ntpd_flags="-s"
The error message from ntpd is also strange. Note that ntpd does not
die, it just seems to hang around doing nothing.
I have to manually move the date to at least 01/01/1970 for 'ntpd -s'
to work:
============================================================
# date
Thu Nov 26 00:40:10 ICT 1931
# ntpd -d -s
ntp engine ready
recvmsg control format 10.10.6.10: No such file or directory
recvmsg control format 10.10.6.11: No such file or directory
no reply received in time, skipping initial time setting
^Cntp engine exiting
Terminating
# date -u 197001010000.00
Thu Jan 1 00:00:00 UTC 1970
# ntpd -d -s
ntp engine ready
reply from 10.10.6.10: offset 1274851922.958401 delay 0.001080, next query 6s
set local clock to Wed May 26 12:32:07 ICT 2010 (offset 1274851922.958401s)
reply from 10.10.6.11: offset 1274851922.956023 delay 0.001321, next query 7s
reply from 10.10.6.10: offset 0.000074 delay 0.000816, next query 9s
reply from 10.10.6.11: offset -0.002262 delay 0.001241, next query 8s
reply from 10.10.6.10: offset 0.000024 delay 0.000715, next query 9s
reply from 10.10.6.11: offset -0.002189 delay 0.001413, next query 8s
peer 10.10.6.11 now valid
reply from 10.10.6.11: offset -0.002417 delay 0.000954, next query 9s
peer 10.10.6.10 now valid
reply from 10.10.6.10: offset -0.000009 delay 0.000691, next query 5s
============================================================
I found code in /usr/src/usr.sbin/ntpd/client.c which I think explains
why I have to manually move the date to at least 01/01/1970:
    if (T4 < JAN_1970) {
        client_log_error(p, "recvmsg control format", EBADF);
        set_next(p, error_interval());
        return (0);
    }
though I guess this test ends up covering more cases than it was
supposed to, given the error message...
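To illustrate why a pre-1970 clock would trip this test, here is a
Python sketch. It assumes T4 is the local receive time moved onto the
NTP timescale by adding JAN_1970 (2208988800, the number of seconds
between 1900 and 1970) to the Unix time. The function names are made up
for illustration and are not ntpd's.

```python
JAN_1970 = 2208988800  # seconds from 1900-01-01 to 1970-01-01 (NTP offset)


def to_ntp_seconds(unix_seconds):
    """Hypothetical helper: Unix time -> seconds since 1900 (assumed T4 formula)."""
    return unix_seconds + JAN_1970


def reply_rejected(unix_seconds):
    """Mirror of the quoted test: reject when T4 falls before 1970."""
    return to_ntp_seconds(unix_seconds) < JAN_1970


# A clock in late November 1931 is a negative Unix time, so the reply is rejected.
print(reply_rejected(-1202365696))
# A clock at exactly 1970-01-01 00:00:00 UTC passes the test.
print(reply_rejected(0))
```

With this reading, any local clock before the Unix epoch makes every
reply look like a malformed message, which would fit the behavior in
the session above: the replies are only accepted once the date has been
moved to 01/01/1970.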
I tried rdate too, without success:
============================================================
# date
Thu Nov 26 00:40:13 ICT 1931
# rdate -nv 10.10.6.11
Thu Nov 26 00:40:15 ICT 1931
rdate: adjust local clock by 9223362813482738688.000000 seconds
# date
Thu Nov 26 00:40:16 ICT 1931
============================================================
60 s/min * 60 min/hour * 24 hours/day * 365 days/year = 31536000 s/year
Then 9223362813482738688 s / (31536000 s/year) = 292470916206 years...
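The division can be checked with exact integer arithmetic. The sketch
below also compares the offset with 2**63, on the assumption (not
confirmed by anything rdate printed) that such a huge value comes from
an overflowed 64-bit quantity rather than from a real time difference.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365  # 365-day year, as in the estimate above

offset = 9223362813482738688  # seconds, copied from the rdate output

years = offset // SECONDS_PER_YEAR  # whole years, integer division
print("years:", years)

# How far is the offset from 2**63, relative to 2**63?
gap = 2**63 - offset
print("relative gap to 2**63:", gap / 2**63)
```

The offset sits within roughly one part in a million of 2**63, which
hints at an overflow somewhere rather than a genuine clock difference.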
Well, I'll just have to put something like 'date -u 197001010000.00'
directly into /etc/rc to help ntpd a bit. Note that I'm not
complaining, the root problem is obviously the dead battery. I was
just a bit surprised that I couldn't use ntpd or rdate to easily fix
it...
On a related note, the date(1) man page seems to be wrong. The
documented '1969-2068' range for years looks more like '1970-2038'
in practice:
============================================================
# date -u 196912312359.59
date: specified date is outside allowed range
# date -u 197001010000.00
Thu Jan 1 00:00:00 UTC 1970
# date -u 206812312359.59
date: specified date is outside allowed range
# date -u 206801010000.00
date: specified date is outside allowed range
# date -u 206701010000.00
date: specified date is outside allowed range
# date -u 206601010000.00
date: specified date is outside allowed range
[...]
# date -u 203701010000.00
Thu Jan 1 00:00:00 UTC 2037
# date -u 203801010000.00
denied attempt to set clock forward to 2145916800
date: settimeofday: Operation not permitted
# date -u 203901010000.00
date: specified date is outside allowed range
============================================================
It's also a bit strange to get two different kinds of error messages.
I only tested it on the sparc64 machine though...
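One possible reading of these results, sketched in Python under
explicit assumptions: time_t is a signed 32-bit integer; date(1)
rejects any value that does not fit, and also -1, since that is the
error return of mktime(); and the kernel refuses to set the clock
within one year of the 32-bit limit. That last cutoff is an assumption
chosen because it matches the session above, and the classify()
function is purely illustrative.

```python
import calendar

INT32_MAX = 2**31 - 1
INT32_MIN = -2**31
ONE_YEAR = 365 * 24 * 60 * 60
KERNEL_CUTOFF = INT32_MAX - ONE_YEAR  # assumed limit, not taken from the source


def classify(year, month, day, hour=0, minute=0, second=0):
    """Guess how setting this UTC date would be handled (illustrative only)."""
    t = calendar.timegm((year, month, day, hour, minute, second))
    if t < INT32_MIN or t > INT32_MAX:
        return t, "outside 32-bit range"
    if t == -1:
        return t, "same value as mktime()'s error return"
    if t > KERNEL_CUTOFF:
        return t, "fits in 32 bits, above assumed kernel cutoff"
    return t, "accepted"


# The dates tried in the session above.
for args in [(1969, 12, 31, 23, 59, 59), (1970, 1, 1), (2037, 1, 1),
             (2038, 1, 1), (2039, 1, 1), (2068, 1, 1)]:
    print(args, *classify(*args))
```

Under these assumptions the two kinds of error message would come from
two different layers: date(1) itself for values it cannot represent,
and settimeofday(2) for the 2038 date, which still fits in 32 bits.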
Philippe