> On 4 May 2022, at 11:33, Miroslav Lichvar <mlich...@redhat.com> wrote:
> 
Hi Miroslav,
Thank you for your response. I appreciate it.
> On Tue, May 03, 2022 at 02:26:21PM +0000, Oleg Obleukhov via Linuxptp-users 
> wrote:
>> Hi team,
>> In large distributed networks many factors can lead to a short-term 
>> spike in offset, primarily network equipment without Transparent Clock 
>> support (even a single such device).
> 
> PTP was designed for networks with constant delay. On switched
> networks that requires full on-path PTP support. If you don't have
> that, you should be looking at NTP or another protocol designed for
> networks with variable delays, where more effective filtering can be
> implemented.
While we are phasing out old equipment, the reality is that there will always 
be some percentage of misbehaving or old switches in large distributed systems 
with thousands of switches on the path. During congestion, which lasts only 
several microseconds, we may be affected, and we need to survive it.
> 
> Of course, that doesn't mean linuxptp couldn't try to do better in
> these suboptimal conditions. The question is if it's in the scope of
> the project. As you seem to have found out, the main issue with the
> current design is that dropping samples can lead to servo instability.
> 
>> Looking at the ptp4l config I didn’t find anything to overcome this 
>> situation and ignore this one bad outlier.
>> I implemented a quick patch 
>> https://gist.github.com/leoleovich/5a4dff7e089bd429c5d208d9276e1683 which 
>> can mitigate this and it works very well:
> 
>> Preventing unnecessary tuning of the servo for a short period of time by 
>> using a padding technique (simply filling with previous values).
The patch I proposed simply doesn’t pass the offset to the servo, so it 
shouldn’t be too bad. For example, with default ptp4l settings we can tolerate 
several missed syncs in a row. But I am open to suggestions, of course.
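
For contrast with dropping the sample, the "padding" wording in the quoted text (filling with previous values) could look roughly like the sketch below. This is only an illustration: `pad_outliers` and `threshold_ns` are hypothetical names, not anything from linuxptp or the patch, and offsets are assumed to be in nanoseconds.

```python
def pad_outliers(offsets, threshold_ns):
    """Replace each sample whose magnitude exceeds threshold_ns with the
    last in-range sample (hypothetical helper, not linuxptp code)."""
    padded = []
    last_good = None
    for offset in offsets:
        if abs(offset) <= threshold_ns:
            last_good = offset
            padded.append(offset)
        elif last_good is not None:
            padded.append(last_good)   # hold the previous value
        else:
            padded.append(offset)      # nothing to pad with yet
    return padded


print(pad_outliers([10, -12, -248483, 8], threshold_ns=20000))
```

With padding the servo still receives a sample on every sync, whereas the approach described in the reply above withholds the sample entirely.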
> 
> That patch seems to be dropping the sample and there is a different
> output shown in the example. Is there a newer version of the patch you
> didn't publish?
The code I suggested matches the output. When occasional spikes arise, it 
simply prints something like:
skip 1/2 large offset (>20000) -248483
The only difference is that max_offset_locked and max_offset_locked_skip should 
be set to 0, whereas currently they are at 20000 and 2 respectively.
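
A minimal sketch of the skip logic as described here, under stated assumptions: it is a Python model, not the C code from the gist. The class `OutlierSkipper` and its `accept` method are hypothetical. Only the two option names and the log format come from this message, and treating 0 as "disabled" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OutlierSkipper:
    """Hypothetical model of the skip logic; not the code from the patch."""
    max_offset_locked: int = 20000     # threshold, assumed ns; 0 = disabled (assumption)
    max_offset_locked_skip: int = 2    # max consecutive samples to skip
    skipped: int = 0

    def accept(self, offset_ns: int, locked: bool = True) -> bool:
        """Return True if the sample should be passed to the servo."""
        enabled = self.max_offset_locked > 0 and self.max_offset_locked_skip > 0
        if (enabled and locked
                and abs(offset_ns) > self.max_offset_locked
                and self.skipped < self.max_offset_locked_skip):
            self.skipped += 1
            print(f"skip {self.skipped}/{self.max_offset_locked_skip} "
                  f"large offset (>{self.max_offset_locked}) {offset_ns}")
            return False
        self.skipped = 0               # any accepted sample resets the counter
        return True


skipper = OutlierSkipper()
results = [skipper.accept(o) for o in (12, -248483, 15)]
# The spike is withheld from the servo; its neighbours pass through.
```

Capping the number of consecutive skips means a genuine, persistent offset change is still passed to the servo after at most `max_offset_locked_skip` samples.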
> 
>> The bottom line is that we need to find a way to ignore outliers in a 
>> locked state, where short-term large jumps in offset are not expected.
>> Please check this out and let me know if there is a better way to handle 
>> this situation or if this patch can inspire any other ideas…
> 
> If a spike filter needs to be implemented, I think it would be better if
> the threshold was automatically adjusted based on the jitter. For an
> example, see the "Popcorn spike suppressor" in RFC5905 (NTPv4).
An automatically adjusted filter would be even better. If you are open to such 
an idea, we can discuss it as well. I wanted to start somewhere.
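
As a starting point for that discussion, a jitter-scaled threshold could be sketched as below. It is loosely inspired by the idea of a popcorn spike suppressor (reject a sample whose change from the previous one is large relative to recent jitter) and is not a transcription of RFC 5905 or of any linuxptp code. The class name and the `gate`, `min_jitter_ns`, `weight` and `max_skip` parameters, including their default values, are illustrative assumptions.

```python
import math


class AdaptiveSpikeFilter:
    """Hypothetical sketch: reject a sample when it differs from the last
    accepted offset by more than gate * jitter."""

    def __init__(self, gate=3.0, min_jitter_ns=50.0, weight=0.25, max_skip=2):
        self.gate = gate                    # threshold multiplier (assumed)
        self.min_jitter_ns = min_jitter_ns  # floor so the threshold is never 0
        self.weight = weight                # weight of the newest difference
        self.max_skip = max_skip            # cap on consecutive rejections
        self.jitter = 0.0
        self.last = None
        self.skipped = 0

    def accept(self, offset_ns):
        """Return True if the sample should be passed to the servo."""
        if self.last is None:
            self.last = offset_ns
            return True
        diff = abs(offset_ns - self.last)
        threshold = self.gate * max(self.jitter, self.min_jitter_ns)
        if diff > threshold and self.skipped < self.max_skip:
            self.skipped += 1
            return False
        self.skipped = 0
        # Exponentially weighted RMS of sample-to-sample differences.
        w = self.weight
        self.jitter = math.sqrt((1 - w) * self.jitter ** 2 + w * diff ** 2)
        self.last = offset_ns
        return True


spike_filter = AdaptiveSpikeFilter()
decisions = [spike_filter.accept(o) for o in (10, -20, 15, -248483, 5)]
```

Because the threshold follows the measured jitter, no fixed value such as 20000 has to be chosen per deployment. The cap on consecutive rejections is kept so that a real step in offset still reaches the servo.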
> 
> -- 
> Miroslav Lichvar
> 
Thank you,
Oleg.
_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users
