Reviewed by: George Wilson <george.wil...@delphix.com>
Reviewed by: Sebastien Roy <sebastien....@delphix.com>

Calibration of the APIC timer is currently performed by taking a single
measurement at boot time.  On some hypervisors, like Azure, the calibration can
be quite far off because timers are emulated.  By taking the median of 5
measurements instead, we can significantly reduce the worst-case calibration
error and slightly improve the average calibration error, without introducing
any invasive code changes.

The APIC is used as a timer in Illumos.  Specifically, it is used by the callout
and cyclic frameworks to generate an interrupt around the time that the closest
timer would expire.  Once in interrupt context, those frameworks call
gethrtime() to determine which timers have expired, so the system doesn't rely
solely on the accuracy of the APIC.
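
The pattern described above can be sketched roughly as follows.  This is only
an illustration, not the illumos callout or cyclic code: sketch_timer_t and
expire_timers() are hypothetical names, and the `now` argument stands in for a
gethrtime() reading taken inside the interrupt handler.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timer record; not an illumos structure. */
typedef struct sketch_timer {
	int64_t st_deadline;	/* absolute expiry time, in nanoseconds */
	int st_fired;		/* set once the timer has been processed */
} sketch_timer_t;

/*
 * Called from the (hypothetical) timer interrupt.  `now` stands in for a
 * gethrtime() reading.  Only timers whose deadline has really passed are
 * fired, so an interrupt that arrives early fires nothing, and one that
 * arrives late fires its timers late.  Returns the number of timers fired.
 */
size_t
expire_timers(sketch_timer_t *timers, size_t ntimers, int64_t now)
{
	size_t fired = 0;

	for (size_t i = 0; i < ntimers; i++) {
		if (!timers[i].st_fired && timers[i].st_deadline <= now) {
			timers[i].st_fired = 1;
			fired++;
		}
	}
	return (fired);
}
```

Because the expiry decision is made against the clock reading rather than the
APIC count, a miscalibrated APIC changes when and how often this function runs,
not which timers it considers expired.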

If the APIC runs slower than real time, we will see more jitter and shorter
timeouts will tend to fire late.  If the APIC runs faster than it should, we
will generate an excessive number of interrupts, since the APIC fires before
any timer has expired.  In any case, I've tested what happens if the APIC is
severely miscalibrated (10% or 1000% of target speed) and it doesn't seem to
create any instability on the system.

At 1000% of the target speed, we see a significant increase in the number of
interrupts fired, especially when the system is idle:

CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys dt idl
  0   41   0    5 9711  247  343    6   20    3   0   527   1   3  0  96
  1   79   0   14 9366  409 1046    8   20    4   0  2894   1   3  0  96

vs. normally:

CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys dt idl
  0  120   0   10  797  254 1082    9   20    3   0  2564   1   2  0  97
  1   80   0   11  830  387  385    7   19    4   0  1175   1   1  0  98

The APIC is calibrated against the 8254 fixed-frequency timer (PIT).  We wait
for the PIT to count a certain number of ticks and then check how many ticks
the APIC counted in the same time interval.  The main issue is that on some
hypervisors, notably Hyper-V, both the 8254 and the APIC are emulated and thus
can sometimes be inconsistent with each other.
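
The arithmetic behind this kind of calibration can be sketched as below.  This
is a simplified illustration, not the routine in apic_timer.c: the names
SKETCH_PIT_HZ and sketch_apic_hz() are hypothetical, and the two tick counts
are assumed to have been sampled over the same interval.

```c
#include <stdint.h>

/* Input clock of the 8254 PIT, in Hz (approximately 1.193182 MHz). */
#define	SKETCH_PIT_HZ	1193182ULL

/*
 * Given how many PIT ticks and APIC timer ticks elapsed over the same
 * interval, estimate the APIC timer frequency in Hz.  Returns 0 if no PIT
 * ticks elapsed.  The APIC count is a 32-bit quantity, so the product
 * below fits comfortably in 64 bits.
 */
uint64_t
sketch_apic_hz(uint32_t pit_ticks, uint32_t apic_ticks)
{
	if (pit_ticks == 0)
		return (0);
	return ((uint64_t)apic_ticks * SKETCH_PIT_HZ / pit_ticks);
}
```

If an emulated timer is read late or early, one of the two counts no longer
matches the interval the other was measured over, and the estimate is skewed
by the same proportion.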

I ran an experiment to measure how much those inconsistencies affect the APIC
calibration factor (which determines how many APIC ticks pass in a given number
of nanoseconds).  Here are the results for about 15000 measurements (collected
by performing 1000 measurements at a time on each boot).

The main observation is that the calibration doesn't seem to change from boot
to boot, and that the accuracy of a measurement doesn't seem to correlate with
when it was taken, which means that very inaccurate measurements happen
randomly.  Most measurements are quite accurate, except for some rare outliers
(as can be seen in the graph).  It was determined that a 5-value median filter
would significantly reduce the worst-case miscalibrations.
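
A 5-value median filter of the kind described here could look roughly like the
sketch below.  The names SKETCH_NSAMPLES and sketch_median() are hypothetical,
and the use of insertion sort is an assumption made for brevity; the patch
itself may select the median differently.

```c
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_NSAMPLES	5	/* filter width discussed in this change */

/*
 * Return the median of SKETCH_NSAMPLES calibration samples.  The input
 * array is left untouched; a local copy is built in sorted order using
 * insertion sort, which is adequate for so few elements.
 */
uint64_t
sketch_median(const uint64_t samples[SKETCH_NSAMPLES])
{
	uint64_t sorted[SKETCH_NSAMPLES];

	for (size_t i = 0; i < SKETCH_NSAMPLES; i++) {
		uint64_t v = samples[i];
		size_t j = i;

		/* Shift larger elements up to make room for v. */
		while (j > 0 && sorted[j - 1] > v) {
			sorted[j] = sorted[j - 1];
			j--;
		}
		sorted[j] = v;
	}
	return (sorted[SKETCH_NSAMPLES / 2]);
}
```

With five samples, up to two outliers in either direction are discarded, which
suits a distribution where most measurements are accurate and bad ones are rare
and random.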

In the results below, stdev % is the standard deviation divided by the
average; min % is how far the lowest measured calibration value is below the
average, and max % is how far the highest measured calibration value is above
the average.

Base Results:
          stdev %   min %   max %
AWS          0.02     1.4     0.1
hyperv       0.79     6.4     5.5
Azure        2.87    35.1   331.1

Using 5-value Median Filter:
          stdev %   min %   max %
AWS          0.01    0.01    0.03
hyperv       0.47    1.47    1.76
Azure        0.50    2.67    1.39

As we can see, using the median filter significantly reduces the worst-case
(min/max) miscalibrations on all platforms, and seems to be a necessity on
Azure to ensure an acceptable worst-case calibration.

Upstream bug: DLPX-50219
You can view, comment on, or merge this pull request online at:

  https://github.com/openzfs/openzfs/pull/578

-- Commit Summary --

  * DLPX-50219 reduce apic calibration error by taking multiple measurements

-- File Changes --

    M usr/src/uts/i86pc/io/pcplusmp/apic_common.c (116)
    M usr/src/uts/i86pc/io/pcplusmp/apic_timer.c (25)
    M usr/src/uts/i86pc/sys/apic.h (3)
    M usr/src/uts/i86pc/sys/apic_common.h (3)

-- Patch Links --

https://github.com/openzfs/openzfs/pull/578.patch
https://github.com/openzfs/openzfs/pull/578.diff
