On Tue, May 11, 2021 at 02:16:04PM +0000, Amar Subramanyam wrote:
> What happens here exactly is, due to the continuous triggering of BMCA, the
> value of mono_interval (the interval between two successive calls of
> clock_check_sample ()) gets increased and SYNCHRONIZATION FAULT occurs with
> the default value of sanity_freq_limit (which is 200000000). Modifying the
> configuration with --sanity_freq_limit=0 will prevent the FAULT from
> occurring, but it will not address the root cause of BMCA getting triggered
> continuously even though there is no change in successive announce messages
> in the port. So we believe that setting --sanity_freq_limit=0 is a work
> around and doesn't directly solve the issue.
>
> Hence the changes we introduced are such that BMCA is not triggered at all
> when there is no change in successive announce messages.

Ok, so it is the clock check as I suspected. I don't see how it is
related to the BMCA or the announce timeout.

The clock check is active when the clock is in a synchronized state and
it checks RX timestamps of event messages. If the clocks are not
synchronized, sync messages received on different ports fail the check.
That's what I saw in my test, even with your patch applied.

There is a race condition with phc2sys. It may not be fast enough to
sync the other clock before it receives an event message.

I think the fix should be one of the following:
- disable clock check in jbod mode (it cannot work reliably as it is)
- limit the check to timestamps from the synchronized port
- have a separate clock check instance for each clock, checking only its
  own timestamps

> > There might be a better name for this function. Maybe something related to
> > its purpose rather than what it does.
>
> Is the name "clock_get_port_client_state" fine? Could you please propose any
> new suggestions?

Maybe something like clock_non_client_port_announce_timer would work
better?

> > Ok, but if this optimization is useful in the jbod mode, it should be
> > useful even in the non-jbod mode, right? Most of the port code shouldn't
> > care about jbod.
>
> Yes, as you suggested this change is useful in both jbod and non-jbod mode to
> avoid unnecessary triggering of BMCA. But there is no impact seen in non jbod
> case as there is only one port. Whereas there is clear impact on the SLAVE
> port in jbod, slaveOnly case, as explained earlier. We didn't want to
> introduce any new variables to the non jbod mode, hence we restricted our
> change to the jbod mode alone.

I think it could work as an optimization to avoid unnecessary calls of
BMCA and spam in the log, but that shouldn't be specific to the jbod
mode.

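
Coming back to the clock check: to illustrate why a single check fed
with timestamps from unsynchronized clocks fails, and how a separate
instance per clock would avoid that, here is a rough sketch. It is a
simplified model and not the linuxptp clockcheck code. The names
struct check, check_sample() and count_faults() are made up for this
example, and the simulated clocks are assumed to run at the correct
rate with only a fixed offset between them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NS_PER_SEC 1000000000LL
#define MAX_CLOCKS 4

/* Hypothetical checker state, one per checked timestamp stream. */
struct check {
	int64_t limit_ppb; /* tolerated frequency error in ppb, 0 disables */
	int have_last;     /* set once a previous sample is stored */
	int64_t last_ts;   /* previous RX timestamp, ns */
	int64_t last_mono; /* monotonic time of the previous sample, ns */
};

/* Returns 1 if the timestamp advanced implausibly vs. monotonic time. */
static int check_sample(struct check *c, int64_t ts, int64_t mono)
{
	int fault = 0;

	if (c->limit_ppb && c->have_last && mono > c->last_mono) {
		int64_t mono_interval = mono - c->last_mono;
		int64_t err = llabs((ts - c->last_ts) - mono_interval);
		/* Largest error the frequency limit allows over this interval. */
		double max_err = (double)mono_interval * c->limit_ppb / 1e9;

		fault = (double)err > max_err;
	}
	c->have_last = 1;
	c->last_ts = ts;
	c->last_mono = mono;
	return fault;
}

/*
 * Simulate one event message per second, arriving round-robin on ports
 * whose clocks differ by fixed offsets. With per_clock == 0 a single
 * checker sees all timestamps; otherwise each clock has its own checker.
 */
static int count_faults(const int64_t *offset, int n_clocks, int n_msgs,
			int64_t limit_ppb, int per_clock)
{
	struct check cc[MAX_CLOCKS] = {{0}};
	int i, faults = 0;

	assert(n_clocks > 0 && n_clocks <= MAX_CLOCKS);
	for (i = 0; i < MAX_CLOCKS; i++)
		cc[i].limit_ppb = limit_ppb;

	for (i = 0; i < n_msgs; i++) {
		int k = i % n_clocks;
		int64_t mono = (int64_t)i * NS_PER_SEC;
		int64_t ts = offset[k] + mono;

		faults += check_sample(&cc[per_clock ? k : 0], ts, mono);
	}
	return faults;
}
```

With two clocks 5 seconds apart and a limit of 200000000 ppb, the shared
checker flags every message after the first one, while the per-clock
checkers flag none. A limit of 0 disables the check in this model, which
hides the fault the same way the workaround does.
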
-- 
Miroslav Lichvar