On 8/22/2023 11:24 AM, Miroslav Lichvar wrote:
> On Mon, Aug 21, 2023 at 07:02:55PM +0200, Maciek Machnikowski wrote:
>> On 8/21/2023 2:04 PM, Miroslav Lichvar wrote:
>>> s->drift is the drift of the uncompensated clock. It's not supposed to be
>>> scaled by anything.
>> If that's the intended use - then the line 147:
>> s->drift += ki_term;
>>
>> where (line 140)
>> ki_term = s->ki * offset * weight;
>>
>> is incorrect. s->drift is used as the ki accumulator in the SERVO_LOCKED
>> state.
> 
> I don't follow. s->drift accumulates the ki term. That's what the code
> does and is supposed to do. Maybe you are misinterpreting frequency as
> time (integrated frequency)?

If that't the case - why do we set it to the offset initially in case 1
(state S1).
To put it in different words - we set the initial value of the ki term
and then accumulate ki further in the locked state.

The math in the PI controller is defined as
u(t) = Kp*a(t) + Ki * SUM(e(0) to e(t))

What we currently do is we set the whole espression of Ki * SUM(errors)
to the SUM(errors) when switching the states, as that's what set in the
s->drift.

It may cause the controller to converge faster and what we do
effectively is:
u(t) = Kp*a(t) + Ki * SUM(e(0) to e(t)) + initial_drift

What is more - if the initial drift is larger than servo->max_frequency
we can never start accumulating Ki
This should also get another patch - if we fall into the ppb beyond the
-servo->max_frequency condition couple times we should probably reset
the servo.
If the offset never changes its sign (we overshoot the correction) we
can never recover from this stage, as we will always add the s->drift to
the offset causing it to always overflow the max_frequency. See the
example below.

> 
>> I connected my DUT B2B to the GM and ran the "default" ptp4l profile.
>> In this case it wouldn't matter whether I use E2E or P2P, as this is
>> just a B2B connection.
> 
> The absolute value of the delay doesn't matter. E2E is sensitive to
> frequency errors due to long interval between sync and delay
> request messages. P2P has much shorter interval and is much less
> sensitive to frequency errors.
> 
>> To set the offset I used phc_ctl freq 5000000 (which is the max default)
>>
>> I did that test on couple different NICs and in all cases this the clock
>> frequency converged faster and with a smaller swing.
> 
> Please post some logs. I cannot actually reproduce your improvement
> even if I use E2E.
>>
>> How did you test it?
> 
> I used SW timestamping and P2P mechanism, everything else default.
> With HW timestamping I get the same result, but it's less apparent
> than with SW (which uses smaller I constant).
> 
> Here is my log with the current code:
> 
> [105.622] master offset  122869904 s0 freq -500000 path delay      8401
> [106.623] master offset  123562072 s0 freq -500000 path delay      7981
> [107.623] master offset  124253734 s0 freq -500000 path delay      7561
> [108.624] master offset  124938607 s0 freq -500000 path delay      7538
> [109.625] master offset  125627975 s0 freq -500000 path delay      7515
> [110.626] master offset  126320033 s0 freq -500000 path delay      7538
> [111.626] master offset  127013885 s0 freq -500000 path delay      7538
> [112.627] master offset  127698254 s0 freq -500000 path delay      7538
> [113.628] master offset  128390502 s0 freq -500000 path delay      7708
> [114.628] master offset  129080142 s0 freq -500000 path delay      8037
> [115.629] master offset  129773245 s0 freq -500000 path delay      8037
> [116.630] master offset  130460300 s0 freq -500000 path delay      8037
> [117.630] master offset  131148113 s0 freq -500000 path delay      8244
> [118.631] master offset  131838752 s0 freq -500000 path delay      8244
> [119.632] master offset  132532898 s0 freq -500000 path delay      8244
> [120.632] master offset  133219091 s0 freq -500000 path delay      8190
> [121.633] master offset  133908331 s1 freq +189771 path delay      7983
> [122.633] master offset       2239 s2 freq +189997 path delay      7992
> [122.633] port 1 (eth0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
> [123.633] master offset       8287 s2 freq +190610 path delay      7992
> [124.633] master offset       2298 s2 freq +190013 path delay      7992
> [125.633] master offset        210 s2 freq +189805 path delay      7992
> [126.633] master offset       -140 s2 freq +189770 path delay      7992
> [127.633] master offset       1205 s2 freq +189905 path delay      7887
> [128.633] master offset       3860 s2 freq +190175 path delay      7887
> [129.633] master offset        475 s2 freq +189837 path delay      7464
> [130.633] master offset       4924 s2 freq +190287 path delay      7887
> [131.633] master offset       2660 s2 freq +190063 path delay      8667
> [132.633] master offset       6939 s2 freq +190498 path delay      8633
> [133.633] master offset        575 s2 freq +189862 path delay      8140
> [134.633] master offset        909 s2 freq +189896 path delay      8140
> [135.633] master offset       -337 s2 freq +189771 path delay      8011
> [136.633] master offset       5063 s2 freq +190316 path delay      7622
> [137.633] master offset       2403 s2 freq +190053 path delay      7622
> [138.633] master offset       2967 s2 freq +190112 path delay      8298
> [139.633] master offset      -1035 s2 freq +189711 path delay      8298
> [140.633] master offset       1932 s2 freq +190009 path delay      7587
> [141.633] master offset       1133 s2 freq +189931 path delay      7587
> 
> Here is a log with your change:
> 
> [196.603] master offset  140133759 s0 freq -500000 path delay      7080
> [197.604] master offset  140824960 s0 freq -500000 path delay      7011
> [198.605] master offset  141519552 s0 freq -500000 path delay      6942
> [199.605] master offset  142205850 s0 freq -500000 path delay      6989
> [200.606] master offset  142900936 s0 freq -500000 path delay      6942
> [201.607] master offset  143589637 s0 freq -500000 path delay      6554
> [202.608] master offset  144278674 s0 freq -500000 path delay      6159
> [203.608] master offset  144965917 s0 freq -500000 path delay      6159
> [204.609] master offset  145659154 s0 freq -500000 path delay      6159
> [205.610] master offset  146348133 s0 freq -500000 path delay      6266
> [206.610] master offset  147038243 s0 freq -500000 path delay      6266
> [207.611] master offset  147730421 s0 freq -500000 path delay      6266
> [208.612] master offset  148416223 s0 freq -500000 path delay      6266
> [209.612] master offset  149114991 s0 freq -500000 path delay      6266
> [210.613] master offset  149796090 s0 freq -500000 path delay      6234
> [211.614] master offset  150486705 s0 freq -500000 path delay      6235
> [212.614] master offset  151176000 s1 freq +190009 path delay      6703
> [213.614] master offset       8247 s2 freq   +1023 path delay      7371
> [213.614] port 1 (eth0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
> [214.615] master offset     189966 s2 freq  +19385 path delay      6882
> [215.615] master offset     360050 s2 freq  +36753 path delay      6414
> [216.615] master offset     519758 s2 freq  +53244 path delay      6414
> [217.615] master offset     650323 s2 freq  +66951 path delay      6871
> [218.615] master offset     774050 s2 freq  +80097 path delay      6871
> [219.615] master offset     884426 s2 freq  +92019 path delay      6669
> [220.615] master offset     985458 s2 freq +103108 path delay      6871
> [221.615] master offset    1069770 s2 freq +112609 path delay      7360
> [222.616] master offset    1144598 s2 freq +121236 path delay      7087
> [223.616] master offset    1213409 s2 freq +129331 path delay      7087
> [224.616] master offset    1273759 s2 freq +136640 path delay      7415
> [225.616] master offset    1328472 s2 freq +143439 path delay      7415
> [226.616] master offset    1373375 s2 freq +149303 path delay      7831
> [227.616] master offset    1418269 s2 freq +155211 path delay      7831
> [228.616] master offset    1448623 s2 freq +159695 path delay      7912
> [229.616] master offset    1479092 s2 freq +164221 path delay      7938
> [230.616] master offset    1505500 s2 freq +168367 path delay      7855
> [231.616] master offset    1526318 s2 freq +171975 path delay      7855
> [232.616] master offset    1546528 s2 freq +175543 path delay      7938
> [233.616] master offset    1559112 s2 freq +178360 path delay      7855
> [234.616] master offset    1570490 s2 freq +181069 path delay      7855
> [235.616] master offset    1587432 s2 freq +184350 path delay      7954
> [236.616] master offset    1592368 s2 freq +186436 path delay      7954
> [237.616] master offset    1590949 s2 freq +187885 path delay      7855
> [238.616] master offset    1592922 s2 freq +189675 path delay      7855
> [239.616] master offset    1591448 s2 freq +191120 path delay      7627
> [240.616] master offset    1590311 s2 freq +192596 path delay      7395
> ...
> [569.614] master offset      42852 s2 freq +190263 path delay      7544
> [570.614] master offset      47391 s2 freq +190764 path delay      7195
> [571.614] master offset      44383 s2 freq +190508 path delay      6932
> [572.614] master offset      43157 s2 freq +190428 path delay      6932
> [573.614] master offset      49738 s2 freq +191136 path delay      6894
> [574.614] master offset      43236 s2 freq +190529 path delay      7337
> [575.614] master offset      44324 s2 freq +190682 path delay      6894
> [576.614] master offset      39937 s2 freq +190283 path delay      6894
> [577.614] master offset      39328 s2 freq +190262 path delay      6530
> [578.614] master offset      43251 s2 freq +190697 path delay      6894
> [579.614] master offset      41456 s2 freq +190559 path delay      7093
> [580.614] master offset      36955 s2 freq +190146 path delay      7589
> 
> In the first log you can see how it's immediately close to the 190ppm
> error of the clock, but in the second log it starts at 19 ppm and takes
> hundreds of seconds to get to the 190 ppm. It's completely broken.
> 
>> It's either this, or change the s->drift to continue accumulating the
>> drift and changing the 142 to
>> ppb = s->kp * offset * weight + s->drift * s->ki * weight + ki_term;
> 
> No, I don't think so.
> 

Without this change the clock will never converge:
─
root@equinox:[~]#: phc_ctl ens6f0np0 set freq 50000000
phc_ctl[229.167]: set clock time to 1692699144.713865829 or Tue Aug 22
12:12:24 2023

phc_ctl[229.167]: adjusted clock frequency offset to 50000000.000000ppb
root@equinox:[~]#: sudo ptp4l -m -i ens6f0np0 -2 -s
--tx_timestamp_timeout 100
ptp4l[241.377]: selected /dev/ptp2 as PTP clock
ptp4l[241.399]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[241.399]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[242.613]: port 1: new foreign master 08c0eb.fffe.a6ee94-1
ptp4l[246.613]: selected best master clock 08c0eb.fffe.a6ee94
ptp4l[246.613]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[248.614]: master offset -301008874 s0 freq -50000000 path delay
1136508
ptp4l[249.615]: master offset -316872413 s1 freq -50000000 path delay
1460911
ptp4l[250.614]: master offset  -15647977 s2 freq -50000000 path delay
1460911
ptp4l[250.614]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[251.614]: master offset  -31187349 s2 freq -50000000 path delay
1460911
ptp4l[252.614]: master offset  -47001580 s2 freq -50000000 path delay
1737634
ptp4l[253.614]: master offset  -62540884 s2 freq -50000000 path delay
1737634
ptp4l[254.614]: master offset  -78314209 s2 freq -50000000 path delay
1973727
ptp4l[255.615]: master offset  -93853589 s2 freq -50000000 path delay
1973727
ptp4l[256.615]: master offset -109154992 s2 freq -50000000 path delay
1737634
ptp4l[257.615]: master offset -124930409 s2 freq -50000000 path delay
1973727
ptp4l[258.615]: master offset -140704010 s2 freq -50000000 path delay
2209820
ptp4l[259.615]: master offset -156243414 s2 freq -50000000 path delay
2209820
ptp4l[260.615]: master offset -171547209 s2 freq -50000000 path delay
1973727
ptp4l[261.615]: master offset -188705603 s2 freq -50000000 path delay
3593233
ptp4l[262.615]: master offset -204243103 s2 freq -50000000 path delay
3593233
ptp4l[263.615]: master offset -219939180 s2 freq -50000000 path delay
3750022
ptp4l[264.615]: master offset -236535471 s2 freq -50000000 path delay
4808817



While with it it finally will:
ptp4l[385.567]: port 1 (ens6f0np0): INITIALIZING to LISTENING on
INIT_COMPLETE
ptp4l[385.567]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on
INIT_COMPLETE
ptp4l[385.567]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on
INIT_COMPLETE
ptp4l[386.621]: port 1 (ens6f0np0): new foreign master 08c0eb.fffe.a6ee94-1
ptp4l[390.622]: selected best master clock 08c0eb.fffe.a6ee94
ptp4l[390.622]: port 1 (ens6f0np0): LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[391.628]: master offset -2211402026 s0 freq -50000000 path delay
6113768
ptp4l[392.629]: master offset -2226945938 s1 freq -15000000 path delay
6113768
ptp4l[393.629]: master offset   14876038 s2 freq -123962 path delay
6113768
ptp4l[393.629]: port 1 (ens6f0np0): UNCALIBRATED to SLAVE on
MASTER_CLOCK_SELECTED
ptp4l[394.629]: master offset   14999038 s2 freq +4461849 path delay
6113768
ptp4l[395.629]: master offset   11854394 s2 freq +5816917 path delay
4791344
ptp4l[396.629]: master offset    7549066 s2 freq +5067907 path delay
3273272
ptp4l[397.629]: master offset    3991934 s2 freq +3775495 path delay
1755200
ptp4l[398.629]: master offset     244633 s2 freq +1225774 path delay
1719773
ptp4l[399.629]: master offset    -906062 s2 freq +148469 path delay
1637432
ptp4l[400.630]: master offset   -1061338 s2 freq -278626 path delay
1637432
ptp4l[401.630]: master offset    -742384 s2 freq -278073 path delay
1590518
ptp4l[402.630]: master offset     -36997 s2 freq +204599 path delay
1156823
ptp4l[403.630]: master offset     239396 s2 freq +469893 path delay
669594
ptp4l[404.630]: master offset      68730 s2 freq +371045 path delay
364004
ptp4l[405.630]: master offset     -34302 s2 freq +288632 path delay
89516
ptp4l[406.630]: master offset    -306618 s2 freq   +6026 path delay
66740
ptp4l[407.630]: master offset    -311653 s2 freq  -90995 path delay
59255
ptp4l[408.631]: master offset    -216263 s2 freq  -89101 path delay
48445
ptp4l[409.631]: master offset    -110178 s2 freq  -47894 path delay
25076
ptp4l[410.631]: master offset     -68642 s2 freq  -39412 path delay
25076
ptp4l[411.631]: master offset     -35626 s2 freq  -26988 path delay
25076
ptp4l[412.631]: master offset     -15022 s2 freq  -17072 path delay
25076

Thanks,
Maciek



_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel

Reply via email to