Hi Jake,

On Tue, 1 Oct 2019 at 00:14, Keller, Jacob E <jacob.e.kel...@intel.com> wrote:
>
> > -----Original Message-----
> > From: Vladimir Oltean [mailto:olte...@gmail.com]
> > Sent: Saturday, September 28, 2019 5:34 AM
> > To: richardcoch...@gmail.com
> > Cc: linuxptp-devel@lists.sourceforge.net
> > Subject: [Linuxptp-devel] [PATCH] port: Deal with higher-order
> > sync/follow-up reordering
> >
> > So a new port config option was introduced, called
> > sync_follow_up_history, with a default of 0 that keeps the same behavior
> > as what was previously intended (and which nobody apparently complained
> > about). So how does this solve the live lock? It doesn't:
> >
> > ptp4l[7502.451]: rms 4 max 8 freq +17765 +/- 8 delay 489 +/- 1
> > ptp4l[7504.474]: rms 4 max 10 freq +17764 +/- 9 delay 489 +/- 1
> > ptp4l[7504.912]: Tail-dropping sync 12899 due to reordering
> > ptp4l[7504.914]: Tail-dropping follow-up 12899 due to reordering
> > ptp4l[7504.944]: Tail-dropping sync 12900 due to reordering
> > ptp4l[7504.975]: Tail-dropping sync 12901 due to reordering
> > ptp4l[7504.977]: Tail-dropping follow-up 12900 due to reordering
> > ptp4l[7505.007]: Tail-dropping sync 12902 due to reordering
> >
> > The (important) differences are that:
> > - The user at least now *knows* what is going on. Previously the only
> >   behavior was that ptp4l was silently dropping frames and
> >   synchronization halted. This is still pretty much fatal even with this
> >   patch, as long as the network keeps pushing frame sequences as above,
> >   but right now it is much more verbose.
> > - The user has a knob to turn to fix this: increase
> >   sync_follow_up_history while striking an acceptable balance with
> >   logSyncInterval.
> >
>
> So what's the argument for not increasing the default history depth a little
> to help alleviate the need to reconfigure it?
>

I just couldn't find a good enough default value other than zero. The
number of sync frames you may want to buffer as a slave or bridge
depends too much on the sync interval of the master. Also, maybe your
network is perfect and never reorders frames, just drops them, and then
why would linuxptp buffer them by default?

> I do agree this is a better approach and better messaging than before, and it
> is significantly more clear what's going wrong.
>

My main concern, on the other hand, is that I can already hear Richard
saying 'fix your driver', and I think that I even partly agree with
that. In more complicated topologies, though, fixing "all the drivers"
might not be as easy, and since linuxptp already claimed it dealt with
reordering, I figured why not do it this way.

I will admit I haven't opened "the book" for this one to see if IEEE
1588 has anything to say about the FSM that a slave should implement for
receiving potentially out-of-order frames. I just assumed it doesn't.

After enough runtime (more than 1 day) I'm still getting some
synchronization hangs even with this patch and a history depth of 4,
although I get no reordering messages - so I don't think the other hangs
are related. Just FYI that my issues are not completely addressed by
this one patch.

> Could we add a section to the man page describing this option and when it
> might make sense? The tail dropping due to re-ordering is clear but may not
> immediately be obvious to users which knobs to use to help resolve the
> problem.
>

The patch does indeed touch ptp4l.8. Did you notice that and think it
could use some improvement?

> Thanks,
> Jake
>

[snip]

Regards,
-Vladimir