On Tue, May 11, 2021 at 02:16:04PM +0000, Amar Subramanyam wrote:
> What happens here exactly is, due to the continuous triggering of BMCA, the
> value of mono_interval (the interval between two successive calls of
> clock_check_sample ()) gets increased and SYNCHRONIZATION FAULT occurs with
> the default value of sanity_freq_limit (which is 200000000). Modifying the
> configuration with --sanity_freq_limit=0 will prevent the FAULT from
> occurring, but it will not address the root cause of BMCA getting triggered
> continuously even though there is no change in successive announce messages
> in the port. So we believe that setting --sanity_freq_limit=0 is a
> workaround and doesn't directly solve the issue.
> Hence the changes we introduced are such that BMCA is not triggered at all 
> when there is no change in successive announce messages.

Ok, so it is the clock check as I suspected. I don't see how it is
related to the BMCA or the announce timeout. The clock check is active
when the clock is in a synchronized state and it checks RX timestamps
of event messages.

If the clocks were not synchronized, sync messages received on
different ports failed the check. That's what I saw in my test, even
with your patch applied.
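
To illustrate why timestamps taken by unsynchronized clocks trip the
check, here is a minimal sketch of a frequency sanity check. It is a
simplified stand-in, not the linuxptp implementation: the struct, the
name check_sample() and the way samples are passed in are assumptions
made for illustration. Only the 200000000 ppb default and the meaning of
0 (check disabled) are taken from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state of one sanity check instance. */
struct check {
	int64_t last_ts;	/* previous RX timestamp, ns */
	int64_t last_mono;	/* monotonic time of previous sample, ns */
	double freq_limit;	/* allowed offset in ppb, 0 disables */
	int have_last;		/* set once a first sample was stored */
};

/*
 * Compare the interval between two RX timestamps with the interval
 * measured on the monotonic clock. Returns 1 if the apparent frequency
 * offset exceeds the limit, 0 otherwise.
 */
static int check_sample(struct check *c, int64_t ts, int64_t mono)
{
	int fault = 0;

	assert(c != NULL);
	if (c->have_last && c->freq_limit > 0.0 && mono > c->last_mono) {
		double interval = (double)(ts - c->last_ts);
		double mono_interval = (double)(mono - c->last_mono);
		double foffset = 1e9 * (interval / mono_interval - 1.0);

		if (foffset > c->freq_limit || foffset < -c->freq_limit)
			fault = 1;
	}
	c->last_ts = ts;
	c->last_mono = mono;
	c->have_last = 1;
	return fault;
}
```

With a single clock, timestamps that advance at roughly the monotonic
rate pass. If the next timestamp comes from a second clock that is tens
of seconds away from the first, the apparent interval no longer matches
the monotonic interval and the sample is reported as a fault.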

There is a race condition with phc2sys. It may not be fast enough to
sync the other clock before an event message arrives on that clock's
port.

I think the fix should be one of the following:
- disable clock check in jbod mode (it cannot work reliably as it is)
- limit the check to timestamps from the synchronized port
- have a separate clock check instance for each clock, checking only
  its own timestamps
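
The third option could look roughly like the sketch below: one check
state per clock, selected by a clock index, so that a timestamp is only
ever compared with earlier timestamps of the same clock. MAX_CLOCKS, the
checks[] table and check_sample_for_clock() are hypothetical names, not
existing linuxptp symbols, and the check itself is reduced to the bare
frequency comparison.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CLOCKS 8	/* hypothetical upper bound on the number of PHCs */

/* Hypothetical per-clock check state. */
struct check {
	int64_t last_ts;	/* previous RX timestamp of this clock, ns */
	int64_t last_mono;	/* monotonic time of that sample, ns */
	int have_last;
};

static struct check checks[MAX_CLOCKS];

/*
 * Check a timestamp against the history of its own clock only.
 * Returns 1 on fault, 0 otherwise. A limit of 0 disables the check.
 */
static int check_sample_for_clock(int phc_index, int64_t ts, int64_t mono,
				  double limit_ppb)
{
	struct check *c;
	int fault = 0;

	assert(phc_index >= 0 && phc_index < MAX_CLOCKS);
	c = &checks[phc_index];
	if (c->have_last && limit_ppb > 0.0 && mono > c->last_mono) {
		double foffset = 1e9 * ((double)(ts - c->last_ts) /
					(double)(mono - c->last_mono) - 1.0);

		if (foffset > limit_ppb || foffset < -limit_ppb)
			fault = 1;
	}
	c->last_ts = ts;
	c->last_mono = mono;
	c->have_last = 1;
	return fault;
}
```

Interleaved timestamps from two clocks that differ by a large constant
offset then pass, because each clock is only compared with itself.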

> > There might be a better name for this function. Maybe something related to 
> > its purpose rather than what it does.
> 
> Is the name "clock_get_port_client_state" fine? Could you please propose any
> new suggestions?

Maybe something like clock_non_client_port_announce_timer would work
better?

> > Ok, but if this optimization is useful in the jbod mode, it should be 
> > useful even in the non-jbod mode, right? Most of the port code shouldn't 
> > care about jbod.
> 
> Yes, as you suggested this change is useful in both jbod and non-jbod mode to 
> avoid unnecessary triggering of BMCA. But there is no impact seen in non jbod 
> case as there is only one port. Whereas there is clear impact on the SLAVE 
> port in jbod, slaveOnly case, as explained earlier. We didn't want to 
> introduce any new variables to the non jbod mode, hence we restricted our 
> change to the jbod mode alone.

I think it could work as an optimization to avoid unnecessary calls of
BMCA and spam in the log, but that shouldn't be specific to the jbod
mode.
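
As a rough sketch of such an optimization, independent of jbod: remember
the last announce data seen on a port and ask for a new BMCA run only
when it differs. The types and functions below (announce_info,
port_sketch, port_handle_announce) are hypothetical and only loosely
modeled on the fields of an announce message; they are not the existing
linuxptp structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the data carried by an announce message. */
struct announce_info {
	uint8_t priority1;
	uint8_t clock_class;
	uint8_t clock_accuracy;
	uint16_t variance;
	uint8_t priority2;
	uint8_t gm_identity[8];
	uint16_t steps_removed;
};

/* Hypothetical per-port state. */
struct port_sketch {
	struct announce_info last;	/* last announce data seen */
	int have_last;
	unsigned int bmca_runs;		/* how often BMCA was requested */
};

/* Field-by-field comparison, so struct padding cannot matter. */
static int announce_equal(const struct announce_info *a,
			  const struct announce_info *b)
{
	return a->priority1 == b->priority1 &&
	       a->clock_class == b->clock_class &&
	       a->clock_accuracy == b->clock_accuracy &&
	       a->variance == b->variance &&
	       a->priority2 == b->priority2 &&
	       a->steps_removed == b->steps_removed &&
	       memcmp(a->gm_identity, b->gm_identity,
		      sizeof(a->gm_identity)) == 0;
}

/* Returns 1 if BMCA should run for this announce, 0 if it is unchanged. */
static int port_handle_announce(struct port_sketch *p,
				const struct announce_info *m)
{
	assert(p != NULL && m != NULL);
	if (p->have_last && announce_equal(&p->last, m))
		return 0;
	p->last = *m;
	p->have_last = 1;
	p->bmca_runs++;
	return 1;
}
```

A real implementation would still have to refresh the announce receipt
timeout on every announce message, including the unchanged ones; the
sketch only covers the decision whether to run BMCA.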

-- 
Miroslav Lichvar



