Kasper,

> A short run here using the latest kernel gives an error of < ± 2e-10
> on 14.31818MHz hpet, and 0.5ppm on tsc.

Ah, we know that frequency well.

14.31818 MHz is 4x the NTSC colorburst frequency (~3.58 MHz). It is also 3x the original PC CPU clock frequency (~4.77 MHz), and 12x the PC timer clock frequency (~1.19318 MHz). The IBM PC had clock division by 3 and 4 to generate those frequencies. Even the laptop I'm using to type this has a high-res timer frequency of 3.579545 MHz. [1] These magic numbers are everywhere.
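The ratios above can be checked with exact rational arithmetic. This is only a sketch: it assumes the master oscillator is exactly 315/22 MHz (4x a colorburst of 315/88 MHz), and the variable names are mine, not from any real codebase.

```python
from fractions import Fraction

# Assumption: the 14.31818 MHz oscillator is exactly 315/22 MHz,
# i.e. 4x an NTSC colorburst frequency of 315/88 MHz.
MASTER_HZ = Fraction(315, 22) * 1_000_000

colorburst_hz = MASTER_HZ / 4    # ~3.579545 MHz
cpu_hz = MASTER_HZ / 3           # ~4.772727 MHz, original PC CPU clock
timer_hz = MASTER_HZ / 12        # ~1.193182 MHz

for name, hz in [("master", MASTER_HZ), ("colorburst", colorburst_hz),
                 ("cpu", cpu_hz), ("timer", timer_hz)]:
    # Print the exact fraction next to its decimal approximation.
    print(f"{name:11s} {hz} Hz  ~ {float(hz):.3f} Hz")
```

None of these frequencies is a whole number of hertz. Each is an integer divided by 11, which is why the decimal values never terminate.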

Time for a short story. About 30 years ago I was involved with the development of the Windows NT kernel on a 64-bit machine. NT timekeeping is pretty much like any OS timekeeping, with ticks and counters and epochs and stuff. One of the machines had a HAL clock rate of 1.193181 MHz. So how do you set an accurate periodic interrupt timer for that weird number?

One way is with divisions and shifts and approximations. And I see that in your Linux code. The guiding principle is usually that "close enough" is good enough. One semi-valid excuse is "the crystal isn't accurate enough anyway". Another valid excuse is "NTP will take care of it", so don't worry.

This attitude is enough to make any time nut cringe. It institutionalizes small errors, excuses rounding and truncation issues. It's "time abuse" at the nanosecond level. And it means if you do happen to have a perfect external clock, the OS still cannot keep correct time. As you and OP now know well.

Anyway, the solution for the NTSC-based HAL was to use pure integers. No shifts, no rounding, no truncation, ever. Instead modulus arithmetic was used. You know, the N/M or N/M+R stuff that you see in a phase correct DDS.

Part of the trick was to realize that the 1.19 MHz number was 1/3rd of the NTSC colorburst frequency. And that is not 3.579 MHz, nor 3.579545 MHz, but instead 315 / 88 MHz. The math (and history) behind NTSC is really cool. Fun for math- and time-nuts. [2] It always boils down to integers, vacuum tube friendly small prime factors like 2, 3, 5, 7, 11, and 13.

If you do your kernel timekeeping in integers and modulus arithmetic you are essentially doing cycle counting and the kernel will keep perfect time relative to the external oscillator. So that should be the goal. Not e-6, not e-9, not e-10, but perfect cycle counting. Consider this a strong plea for someone in both BSD- and Linux-land to pull that off.
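To make the idea concrete, here is a minimal sketch of remainder-carrying integer timekeeping. It is not the NT HAL code. It assumes a counter running at exactly 105/88 MHz, so one tick lasts 88000/105 = 17600/21 ns; the class and function names are hypothetical.

```python
# Assumption: the counter runs at exactly 105/88 MHz (one third of the
# 315/88 MHz colorburst), so one tick lasts 88000/105 = 17600/21 ns.
NS_NUM = 17600  # nanoseconds per tick, numerator
NS_DEN = 21     # nanoseconds per tick, denominator


class ExactClock:
    """Turn counter ticks into whole nanoseconds, carrying the remainder."""

    def __init__(self):
        self.ns = 0    # whole nanoseconds elapsed
        self.rem = 0   # leftover numerator, always 0 <= rem < NS_DEN

    def advance(self, ticks):
        # Add the new ticks to the carried remainder, then split the total
        # into whole nanoseconds and a new remainder. Nothing is discarded.
        whole, self.rem = divmod(self.rem + ticks * NS_NUM, NS_DEN)
        self.ns += whole


def run(clock, total_ticks, chunk):
    """Feed total_ticks to the clock in pieces of at most chunk ticks."""
    remaining = total_ticks
    while remaining > 0:
        step = min(chunk, remaining)
        clock.advance(step)
        remaining -= step
    return clock


TICKS_IN_11_S = 13_125_000   # 11 s at 105/88 MHz is a whole number of ticks

exact = run(ExactClock(), TICKS_IN_11_S, chunk=1193)
truncated_ns = TICKS_IN_11_S * (NS_NUM // NS_DEN)   # "close enough": 838 ns per tick

print(exact.ns, exact.rem)   # 11000000000 0
print(truncated_ns)          # 10998750000
```

After exactly 11 seconds of ticks the remainder-carrying clock reads 11,000,000,000 ns with nothing left over, however the ticks were chunked. The truncated 838 ns/tick version is already 1.25 ms short, about 114 ppm.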

Ironically the first UNIX I worked on was a PDP-11 and it had "cycle accurate" timekeeping. It was based on a 60 Hz mains interrupt and the kernel code to increment time used modulus arithmetic. The code was essentially: if(++lbolt >= HZ) { ... lbolt =- HZ; ... } where HZ is 60.
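For readers who don't know early C: `=-` is the old spelling of `-=`. The fragment above can be sketched in Python as follows. The `seconds` counter stands in for the elided `...` parts and is my assumption, as are the class and method names.

```python
HZ = 60  # mains interrupts per second, as in the original


class MainsClock:
    def __init__(self):
        self.lbolt = 0     # interrupts counted within the current second
        self.seconds = 0   # whole seconds elapsed (assumed; elided in the original)

    def tick(self):
        """Called once per mains interrupt."""
        self.lbolt += 1
        if self.lbolt >= HZ:
            self.lbolt -= HZ   # subtract HZ rather than reset to zero
            self.seconds += 1


clock = MainsClock()
for _ in range(60 * 3600):   # one hour of 60 Hz interrupts
    clock.tick()
print(clock.seconds, clock.lbolt)   # 3600 0
```

Every interrupt is counted and none is rounded away, so the clock is exactly as good as the mains frequency driving it.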

/tvb

[1] C:\tvb\> qpc

frequency: 0x00000000_00369E99          3579545 3.579545000000000e+006          3.579545000 MHz
counter:   0x00000091_338B6739     623635031865 6.236350318650000e+011     174221.872295222 s

[2] How to become a bit banging, cycle counting, PIC loving, embedded programming, time nut.


On 1/6/2021 10:39 AM, Kasper Pedersen wrote:
On 06.01.2021 06.35, Luiz Paulo Damaceno wrote:
Hi all,

I'm studying computer timekeeping, and I'm at the stage of removing the base
crystal that feeds the entire PLL logic of the motherboard (24 MHz on the
motherboard that I'm using) and comparing the system's time with an NTP server.

(After reading your posts, and your plot, and guessing at PCish
motherboard and Linux for the board under test)

I did the same thing many moons ago now, and back then it was much
worse. It turned out to be a painful rounding error in the kernel:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a386b5af8edda1c742ce9f77891e112eefffc005

After that quick-fix there was 5e-9 left, about what your plot shows.
After later cleanup that should now be less than 3e-10.


What clocksource are you using? See

/sys/devices/system/clocksource/clocksource0/current_clocksource
and
/sys/devices/system/clocksource/clocksource0/available_clocksource

You should pick the one that has the least unpredictable
(rounding-error-prone in hardware) PLL ratios between the crystal
oscillator and the clocksource. On most PCs that is hpet, which is
typically driven directly from the crystal, though in your case this may
be a synthesized 14.31818 MHz (the most common hpet frequency).
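A small sketch for reading those two sysfs files on a Linux machine. The function name and the `base` parameter are illustrative only; the paths are the ones given above.

```python
from pathlib import Path

# Directory holding the two files named above (Linux only).
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_DIR):
    """Return (current, available) clocksource names read from base."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


if __name__ == "__main__" and SYSFS_DIR.exists():
    current, available = read_clocksources()
    print("current:  ", current)
    print("available:", ", ".join(available))
```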


A short run here using the latest kernel gives an error of < +/- 2e-10
on 14.31818MHz hpet, and 0.5ppm on tsc.


/Kasper Pedersen

_______________________________________________
time-nuts mailing list -- [email protected]
To unsubscribe, go to 
http://lists.febo.com/mailman/listinfo/time-nuts_lists.febo.com
and follow the instructions there.


