On Wed, Sep 24, 2025 at 08:49:41AM +0200, Samuel Thibault wrote:
> Michael Banck via Bug reports for the GNU Hurd wrote on Wed, Sep 24,
> 2025 at 08:41:51 +0200:
> > On Sun, Sep 21, 2025 at 09:14:04AM +0000, Damien Zammit wrote:
> > > Between reading mtime and reading hpclock_read_counter,
> > > there may be an interrupt that updates mtime, therefore
> > > we need a check to perform the clock read process again
> > > in this case.
> > > 
> > > TESTED: on UP using:
> > 
> > There is a PostgreSQL isolation test that seems to be tripped up by
> > the clock, while not moving backwards, not moving forward either, i.e.
> > reporting the same timestamp twice in a row on subsequent
> > clock_gettime() calls.
> > 
> > If I run the test case from [1] like this, then even with this patch
> > applied I still get, after a few thousand iterations:
> > 
> > |$ for i in {1..10000}; do printf "ITERATION $i "; ./tt 100 || break; done
> > [...]
> > | ITERATION 3029 t1: -2073074700, t2: -2073069580, t2 - t1: 5120 (r: 4950)
> > | ITERATION 3030 t1: -2070257921, t2: -2070257921, t2 - t1: 0 (r: 4950)
> 
> Yes, that can still happen with the current implementation: we really
> advance the time only on the clock tick, and use the hpet just to
> interpolate the time between ticks. If for whatever reason the clock and the
> hpet are not perfectly synchronized, we clamp the advance so that the
> reported time stays monotonic. So two consecutive calls may report the
> same value. Trying to make sure that time always progresses at least by
> 1ns (or 1µs if the application is using gettimeofday...) would be quite
> involved.
> 
> I'd tend to say this is an issue in postgresql: it shouldn't assume that
> clocks have infinite precision.

I guess there is a spectrum here - certainly infinite precision is
unrealistic, but the question is what kind of minimum timer precision
applications can require (I've asked what the current requirements of
Postgres are).

Postgres has a pg_test_timing utility, the version in master/HEAD has
been enhanced to use ns resolution, and there I get on my qemu/kvm VM:

$ LANG=C ./pg_test_timing
Testing timing overhead for 3 seconds.
Average loop time including overhead: 13866,64 ns
Histogram of timing durations:
   <= ns   % of total  running %      count
       0       0,0510     0,0510        122
       1       0,0000     0,0510          0
       3       0,0000     0,0510          0
       7       0,0000     0,0510          0
      15       0,0000     0,0510          0
      31       0,0000     0,0510          0
      63       0,0000     0,0510          0
     127       0,0000     0,0510          0
     255       0,0000     0,0510          0
     511       0,0000     0,0510          0
    1023       0,0004     0,0514          1
    2047       0,0000     0,0514          0
    4095      98,9320    98,9834     236681
    8191       0,8845    99,8679       2116
   16383       0,0393    99,9072         94
   32767       0,0343    99,9415         82
[...]
536870911       0,0004    99,9996          1
1073741823       0,0004   100,0000          1

Observed timing durations up to 99,9900%:
      ns   % of total  running %      count
       0       0,0510     0,0510        122
     729       0,0004     0,0514          1
    3519       0,0004     0,0518          1
    3630       0,0130     0,0648         31
    3640       0,1651     0,2299        395
    3650       0,7449     0,9748       1782
    3660       2,3395     3,3143       5597
[...]
    9980       0,0004    99,8892          1
    9990       0,0004    99,8896          1
...
782724560       0,0004   100,0000          1

Whereas on my Linux Thinkpad host, I see this:

$ LANG=C ./pg_test_timing
Testing timing overhead for 3 seconds.
Average loop time including overhead: 13.84 ns
Histogram of timing durations:
   <= ns   % of total  running %      count
       0       0.0000     0.0000          0
       1       0.0000     0.0000          0
       3       0.0000     0.0000          0
       7       0.0000     0.0000          0
      15      97.3170    97.3170  210936922
      31       2.6288    99.9458    5697932
      63       0.0505    99.9963     109441
     127       0.0008    99.9971       1782
     255       0.0022    99.9992       4674
     511       0.0004    99.9996        789
    1023       0.0002    99.9998        448
    2047       0.0001    99.9999        260
    4095       0.0000    99.9999         29
    8191       0.0000   100.0000         28
   16383       0.0000   100.0000          9
   32767       0.0000   100.0000         20
   65535       0.0000   100.0000         71
  131071       0.0000   100.0000          1

Observed timing durations up to 99.9900%:
      ns   % of total  running %      count
      12       0.6013     0.6013    1303308
      13      30.4999    31.1012   66109219
      14      61.5536    92.6548  133418976
      15       4.6622    97.3170   10105419
      16       1.5844    98.9014    3434318
      17       0.6401    99.5415    1387333
      18       0.1352    99.6766     292948
      19       0.1831    99.8597     396853
      20       0.0499    99.9097     108239
      21       0.0094    99.9191      20358
      22       0.0056    99.9247      12191
      23       0.0065    99.9312      14037
      24       0.0072    99.9384      15645
      25       0.0023    99.9407       4988
      26       0.0008    99.9415       1819
      27       0.0004    99.9419        863
      28       0.0008    99.9427       1706
      29       0.0007    99.9434       1486
      30       0.0006    99.9440       1296
      31       0.0018    99.9458       3852
      32       0.0018    99.9476       3884
      33       0.0003    99.9479        675
      34       0.0001    99.9480        180
      35       0.0001    99.9480        120
      36       0.0000    99.9480         42
      37       0.0000    99.9480         38
      38       0.0000    99.9481         38
      39       0.0000    99.9481         30
      40       0.0009    99.9489       1885
      41       0.0046    99.9536      10039
      42       0.0137    99.9673      29692
      43       0.0157    99.9830      34041
      44       0.0089    99.9918      19219
...
   95775       0.0000   100.0000          1

(I should get a Linux VM running on qemu/kvm and compare timings there)

Those 0 ns deltas on qemu are the problem for the (probably artificial)
stats isolation test, but the Postgres hackers are also very unhappy
about the proposal to add random sleep delays to their test suite.


Michael
