On 8/22/2023 11:24 AM, Miroslav Lichvar wrote: > On Mon, Aug 21, 2023 at 07:02:55PM +0200, Maciek Machnikowski wrote: >> On 8/21/2023 2:04 PM, Miroslav Lichvar wrote: >>> s->drift is the drift of the uncompensated clock. It's not supposed to be >>> scaled by anything. >> If that's the intended use - then the line 147: >> s->drift += ki_term; >> >> where (line 140) >> ki_term = s->ki * offset * weight; >> >> is incorrect. s->drift is used as the ki accumulator in the SERVO_LOCKED >> state. > > I don't follow. s->drift accumulates the ki term. That's what the code > does and is supposed to do. Maybe you are misinterpreting frequency as > time (integrated frequency)?
If that't the case - why do we set it to the offset initially in case 1 (state S1). To put it in different words - we set the initial value of the ki term and then accumulate ki further in the locked state. The math in the PI controller is defined as u(t) = Kp*a(t) + Ki * SUM(e(0) to e(t)) What we currently do is we set the whole espression of Ki * SUM(errors) to the SUM(errors) when switching the states, as that's what set in the s->drift. It may cause the controller to converge faster and what we do effectively is: u(t) = Kp*a(t) + Ki * SUM(e(0) to e(t)) + initial_drift What is more - if the initial drift is larger than servo->max_frequency we can never start accumulating Ki This should also get another patch - if we fall into the ppb beyond the -servo->max_frequency condition couple times we should probably reset the servo. If the offset never changes its sign (we overshoot the correction) we can never recover from this stage, as we will always add the s->drift to the offset causing it to always overflow the max_frequency. See the example below. > >> I connected my DUT B2B to the GM and ran the "default" ptp4l profile. >> In this case it wouldn't matter whether I use E2E or P2P, as this is >> just a B2B connection. > > The absolute value of the delay doesn't matter. E2E is sensitive to > frequency errors due to long interval between sync and delay > request messages. P2P has much shorter interval and is much less > sensitive to frequency errors. > >> To set the offset I used phc_ctl freq 5000000 (which is the max default) >> >> I did that test on couple different NICs and in all cases this the clock >> frequency converged faster and with a smaller swing. > > Please post some logs. I cannot actually reproduce your improvement > even if I use E2E. >> >> How did you test it? > > I used SW timestamping and P2P mechanism, everything else default. > With HW timestamping I get the same result, but it's less apparent > than with SW (which uses smaller I constant). > > Here is my log with the current code: > > [105.622] master offset 122869904 s0 freq -500000 path delay 8401 > [106.623] master offset 123562072 s0 freq -500000 path delay 7981 > [107.623] master offset 124253734 s0 freq -500000 path delay 7561 > [108.624] master offset 124938607 s0 freq -500000 path delay 7538 > [109.625] master offset 125627975 s0 freq -500000 path delay 7515 > [110.626] master offset 126320033 s0 freq -500000 path delay 7538 > [111.626] master offset 127013885 s0 freq -500000 path delay 7538 > [112.627] master offset 127698254 s0 freq -500000 path delay 7538 > [113.628] master offset 128390502 s0 freq -500000 path delay 7708 > [114.628] master offset 129080142 s0 freq -500000 path delay 8037 > [115.629] master offset 129773245 s0 freq -500000 path delay 8037 > [116.630] master offset 130460300 s0 freq -500000 path delay 8037 > [117.630] master offset 131148113 s0 freq -500000 path delay 8244 > [118.631] master offset 131838752 s0 freq -500000 path delay 8244 > [119.632] master offset 132532898 s0 freq -500000 path delay 8244 > [120.632] master offset 133219091 s0 freq -500000 path delay 8190 > [121.633] master offset 133908331 s1 freq +189771 path delay 7983 > [122.633] master offset 2239 s2 freq +189997 path delay 7992 > [122.633] port 1 (eth0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED > [123.633] master offset 8287 s2 freq +190610 path delay 7992 > [124.633] master offset 2298 s2 freq +190013 path delay 7992 > [125.633] master offset 210 s2 freq +189805 path delay 7992 > [126.633] master offset -140 s2 freq +189770 path delay 7992 > [127.633] master offset 1205 s2 freq +189905 path delay 7887 > [128.633] master offset 3860 s2 freq +190175 path delay 7887 > [129.633] master offset 475 s2 freq +189837 path delay 7464 > [130.633] master offset 4924 s2 freq +190287 path delay 7887 > [131.633] master offset 2660 s2 freq +190063 path delay 8667 > [132.633] master offset 6939 s2 freq +190498 path delay 8633 > [133.633] master offset 575 s2 freq +189862 path delay 8140 > [134.633] master offset 909 s2 freq +189896 path delay 8140 > [135.633] master offset -337 s2 freq +189771 path delay 8011 > [136.633] master offset 5063 s2 freq +190316 path delay 7622 > [137.633] master offset 2403 s2 freq +190053 path delay 7622 > [138.633] master offset 2967 s2 freq +190112 path delay 8298 > [139.633] master offset -1035 s2 freq +189711 path delay 8298 > [140.633] master offset 1932 s2 freq +190009 path delay 7587 > [141.633] master offset 1133 s2 freq +189931 path delay 7587 > > Here is a log with your change: > > [196.603] master offset 140133759 s0 freq -500000 path delay 7080 > [197.604] master offset 140824960 s0 freq -500000 path delay 7011 > [198.605] master offset 141519552 s0 freq -500000 path delay 6942 > [199.605] master offset 142205850 s0 freq -500000 path delay 6989 > [200.606] master offset 142900936 s0 freq -500000 path delay 6942 > [201.607] master offset 143589637 s0 freq -500000 path delay 6554 > [202.608] master offset 144278674 s0 freq -500000 path delay 6159 > [203.608] master offset 144965917 s0 freq -500000 path delay 6159 > [204.609] master offset 145659154 s0 freq -500000 path delay 6159 > [205.610] master offset 146348133 s0 freq -500000 path delay 6266 > [206.610] master offset 147038243 s0 freq -500000 path delay 6266 > [207.611] master offset 147730421 s0 freq -500000 path delay 6266 > [208.612] master offset 148416223 s0 freq -500000 path delay 6266 > [209.612] master offset 149114991 s0 freq -500000 path delay 6266 > [210.613] master offset 149796090 s0 freq -500000 path delay 6234 > [211.614] master offset 150486705 s0 freq -500000 path delay 6235 > [212.614] master offset 151176000 s1 freq +190009 path delay 6703 > [213.614] master offset 8247 s2 freq +1023 path delay 7371 > [213.614] port 1 (eth0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED > [214.615] master offset 189966 s2 freq +19385 path delay 6882 > [215.615] master offset 360050 s2 freq +36753 path delay 6414 > [216.615] master offset 519758 s2 freq +53244 path delay 6414 > [217.615] master offset 650323 s2 freq +66951 path delay 6871 > [218.615] master offset 774050 s2 freq +80097 path delay 6871 > [219.615] master offset 884426 s2 freq +92019 path delay 6669 > [220.615] master offset 985458 s2 freq +103108 path delay 6871 > [221.615] master offset 1069770 s2 freq +112609 path delay 7360 > [222.616] master offset 1144598 s2 freq +121236 path delay 7087 > [223.616] master offset 1213409 s2 freq +129331 path delay 7087 > [224.616] master offset 1273759 s2 freq +136640 path delay 7415 > [225.616] master offset 1328472 s2 freq +143439 path delay 7415 > [226.616] master offset 1373375 s2 freq +149303 path delay 7831 > [227.616] master offset 1418269 s2 freq +155211 path delay 7831 > [228.616] master offset 1448623 s2 freq +159695 path delay 7912 > [229.616] master offset 1479092 s2 freq +164221 path delay 7938 > [230.616] master offset 1505500 s2 freq +168367 path delay 7855 > [231.616] master offset 1526318 s2 freq +171975 path delay 7855 > [232.616] master offset 1546528 s2 freq +175543 path delay 7938 > [233.616] master offset 1559112 s2 freq +178360 path delay 7855 > [234.616] master offset 1570490 s2 freq +181069 path delay 7855 > [235.616] master offset 1587432 s2 freq +184350 path delay 7954 > [236.616] master offset 1592368 s2 freq +186436 path delay 7954 > [237.616] master offset 1590949 s2 freq +187885 path delay 7855 > [238.616] master offset 1592922 s2 freq +189675 path delay 7855 > [239.616] master offset 1591448 s2 freq +191120 path delay 7627 > [240.616] master offset 1590311 s2 freq +192596 path delay 7395 > ... > [569.614] master offset 42852 s2 freq +190263 path delay 7544 > [570.614] master offset 47391 s2 freq +190764 path delay 7195 > [571.614] master offset 44383 s2 freq +190508 path delay 6932 > [572.614] master offset 43157 s2 freq +190428 path delay 6932 > [573.614] master offset 49738 s2 freq +191136 path delay 6894 > [574.614] master offset 43236 s2 freq +190529 path delay 7337 > [575.614] master offset 44324 s2 freq +190682 path delay 6894 > [576.614] master offset 39937 s2 freq +190283 path delay 6894 > [577.614] master offset 39328 s2 freq +190262 path delay 6530 > [578.614] master offset 43251 s2 freq +190697 path delay 6894 > [579.614] master offset 41456 s2 freq +190559 path delay 7093 > [580.614] master offset 36955 s2 freq +190146 path delay 7589 > > In the first log you can see how it's immediately close to the 190ppm > error of the clock, but in the second log it starts at 19 ppm and takes > hundreds of seconds to get to the 190 ppm. It's completely broken. > >> It's either this, or change the s->drift to continue accumulating the >> drift and changing the 142 to >> ppb = s->kp * offset * weight + s->drift * s->ki * weight + ki_term; > > No, I don't think so. > Without this change the clock will never converge: ─ root@equinox:[~]#: phc_ctl ens6f0np0 set freq 50000000 phc_ctl[229.167]: set clock time to 1692699144.713865829 or Tue Aug 22 12:12:24 2023 phc_ctl[229.167]: adjusted clock frequency offset to 50000000.000000ppb root@equinox:[~]#: sudo ptp4l -m -i ens6f0np0 -2 -s --tx_timestamp_timeout 100 ptp4l[241.377]: selected /dev/ptp2 as PTP clock ptp4l[241.399]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[241.399]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[242.613]: port 1: new foreign master 08c0eb.fffe.a6ee94-1 ptp4l[246.613]: selected best master clock 08c0eb.fffe.a6ee94 ptp4l[246.613]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[248.614]: master offset -301008874 s0 freq -50000000 path delay 1136508 ptp4l[249.615]: master offset -316872413 s1 freq -50000000 path delay 1460911 ptp4l[250.614]: master offset -15647977 s2 freq -50000000 path delay 1460911 ptp4l[250.614]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[251.614]: master offset -31187349 s2 freq -50000000 path delay 1460911 ptp4l[252.614]: master offset -47001580 s2 freq -50000000 path delay 1737634 ptp4l[253.614]: master offset -62540884 s2 freq -50000000 path delay 1737634 ptp4l[254.614]: master offset -78314209 s2 freq -50000000 path delay 1973727 ptp4l[255.615]: master offset -93853589 s2 freq -50000000 path delay 1973727 ptp4l[256.615]: master offset -109154992 s2 freq -50000000 path delay 1737634 ptp4l[257.615]: master offset -124930409 s2 freq -50000000 path delay 1973727 ptp4l[258.615]: master offset -140704010 s2 freq -50000000 path delay 2209820 ptp4l[259.615]: master offset -156243414 s2 freq -50000000 path delay 2209820 ptp4l[260.615]: master offset -171547209 s2 freq -50000000 path delay 1973727 ptp4l[261.615]: master offset -188705603 s2 freq -50000000 path delay 3593233 ptp4l[262.615]: master offset -204243103 s2 freq -50000000 path delay 3593233 ptp4l[263.615]: master offset -219939180 s2 freq -50000000 path delay 3750022 ptp4l[264.615]: master offset -236535471 s2 freq -50000000 path delay 4808817 While with it it finally will: ptp4l[385.567]: port 1 (ens6f0np0): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[385.567]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[385.567]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[386.621]: port 1 (ens6f0np0): new foreign master 08c0eb.fffe.a6ee94-1 ptp4l[390.622]: selected best master clock 08c0eb.fffe.a6ee94 ptp4l[390.622]: port 1 (ens6f0np0): LISTENING to UNCALIBRATED on RS_SLAVE ptp4l[391.628]: master offset -2211402026 s0 freq -50000000 path delay 6113768 ptp4l[392.629]: master offset -2226945938 s1 freq -15000000 path delay 6113768 ptp4l[393.629]: master offset 14876038 s2 freq -123962 path delay 6113768 ptp4l[393.629]: port 1 (ens6f0np0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED ptp4l[394.629]: master offset 14999038 s2 freq +4461849 path delay 6113768 ptp4l[395.629]: master offset 11854394 s2 freq +5816917 path delay 4791344 ptp4l[396.629]: master offset 7549066 s2 freq +5067907 path delay 3273272 ptp4l[397.629]: master offset 3991934 s2 freq +3775495 path delay 1755200 ptp4l[398.629]: master offset 244633 s2 freq +1225774 path delay 1719773 ptp4l[399.629]: master offset -906062 s2 freq +148469 path delay 1637432 ptp4l[400.630]: master offset -1061338 s2 freq -278626 path delay 1637432 ptp4l[401.630]: master offset -742384 s2 freq -278073 path delay 1590518 ptp4l[402.630]: master offset -36997 s2 freq +204599 path delay 1156823 ptp4l[403.630]: master offset 239396 s2 freq +469893 path delay 669594 ptp4l[404.630]: master offset 68730 s2 freq +371045 path delay 364004 ptp4l[405.630]: master offset -34302 s2 freq +288632 path delay 89516 ptp4l[406.630]: master offset -306618 s2 freq +6026 path delay 66740 ptp4l[407.630]: master offset -311653 s2 freq -90995 path delay 59255 ptp4l[408.631]: master offset -216263 s2 freq -89101 path delay 48445 ptp4l[409.631]: master offset -110178 s2 freq -47894 path delay 25076 ptp4l[410.631]: master offset -68642 s2 freq -39412 path delay 25076 ptp4l[411.631]: master offset -35626 s2 freq -26988 path delay 25076 ptp4l[412.631]: master offset -15022 s2 freq -17072 path delay 25076 Thanks, Maciek _______________________________________________ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel