Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-21 Thread Amar Subramanyam via Linuxptp-devel
Hi Miroslav Lichvar,

Please find our comments inline.

Thanks,
Amar B S

> -Original Message-
> From: Miroslav Lichvar [mailto:mlich...@redhat.com]
> Sent: 20 May 2021 18:43
> To: Amar Subramanyam 
> Cc: linuxptp-devel@lists.sourceforge.net
> Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran
> with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)
> 
> On Mon, May 17, 2021 at 10:02:59AM +0300, Amar Subramanyam via Linuxptp-
> devel wrote:
> > This patch addresses the following issues when ptp4l is run on
> > multiple ports with jbod and client only mode (i.e. clientOnly=1 and
> > boundary_clock_jbod=1):-
> >
> >   1.The LISTENING port prints continuously
> >   "selected best master clock 00..03
> >   updating UTC offset to 37"
> >
> >   We limited the log such that now it prints only when there is a
> >   change in the best-master clock.
> >
> >   2.The port other than SLAVE (LISTENING port) prints an error
> >   "port 1: master state recommended in slave only mode
> >   ptp4l[1205469.356]: port 1: defaultDS.priority1 probably 
> > misconfigured"
> >   for every ANNOUNCE RECEIPT Timeout.
> 
> I think it would be better to have separate patches for the two issues.

We will send out separate patches for these issues.

> 
> > -static void clock_update_slave(struct clock *c)
> > +static void clock_update_slave(struct clock *c, int mdiff)
> >  {
> >   struct parentDS *pds = &c->dad.pds;
> >   struct ptp_message *msg;
> >
> > - if (!c->best)
> > + if (!c->best || !mdiff)
> >   return;
> 
> Instead of adding and checking the mdiff parameter here, not doing anything,
> why not just not call the function?

That is another way of handling it; our intent was to avoid an explicit check
at every place the function is called.
We can change it so that the caller skips the call instead.
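
The two options discussed here can be sketched side by side. This is a minimal,
self-contained illustration and not linuxptp code: struct clock_sketch,
update_slave_checked(), update_slave() and handle_state_decision() are
hypothetical names, and mdiff is assumed to be nonzero only when the best
master changed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the clock; not the real linuxptp struct. */
struct clock_sketch {
	const void *best;   /* non-NULL once a best master is selected */
	int updates;        /* counts how often parent data was refreshed */
};

/* Option 1 (as in the patch): the callee checks mdiff itself. */
static void update_slave_checked(struct clock_sketch *c, int mdiff)
{
	assert(c != NULL);
	if (!c->best || !mdiff)
		return;
	c->updates++;
}

/* Option 2 (as suggested in review): the callee keeps its old signature... */
static void update_slave(struct clock_sketch *c)
{
	assert(c != NULL);
	if (!c->best)
		return;
	c->updates++;
}

/* ...and the caller simply skips the call when nothing changed. */
static void handle_state_decision(struct clock_sketch *c, int mdiff)
{
	if (mdiff)
		update_slave(c);
}
```

Both variants refresh the parent data the same number of times; the second one
keeps the condition at the call site and leaves the function signature alone.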

> 
> > diff --git a/port.c b/port.c
> > index 10bb9e1..650ca00 100644
> > --- a/port.c
> > +++ b/port.c
> > @@ -2531,7 +2531,7 @@ void port_dispatch(struct port *p, enum
> > fsm_event event, int mdiff)  static void bc_dispatch(struct port *p,
> > enum fsm_event event, int mdiff)  {
> >   if (clock_slave_only(p->clock)) {
> > - if (event == EV_RS_MASTER || event == EV_RS_GRAND_MASTER) {
> > + if (event == EV_RS_GRAND_MASTER) {
> >   port_slave_priority_warning(p);
> >   }
> >   }
> 
> Makes sense to me.
> 
> There is a comment in bmc_state_decision() referring to this code, which might
> need to be updated with this change, or maybe that whole check specific to the
> Automotive profile could be removed if the log message was the only reason for
> it (I'm not sure).

Yes, we think the log message was the only reason it was added. However, the
comment and check in bmc_state_decision() are still valid and no changes are
required.
Designated slave-only mode raises the event EV_RS_GRAND_MASTER, not
EV_RS_MASTER, when there is no foreign master, so our change does not affect
the comment and check in bmc_state_decision().
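
To make the effect of the port.c hunk concrete, the sketch below compares the
warning condition before and after the change. It is a simplified model, not
the real bc_dispatch(): enum ev_sketch and the functions warns_before(),
warns_after() and check_patch_effect() are hypothetical, and slave_only stands
in for the result of clock_slave_only().

```c
#include <assert.h>

/* Hypothetical event names; the real enum fsm_event has more members. */
enum ev_sketch { SK_RS_MASTER, SK_RS_GRAND_MASTER, SK_RS_SLAVE };

/* Condition before the patch: warn on either master recommendation. */
static int warns_before(int slave_only, enum ev_sketch ev)
{
	return slave_only && (ev == SK_RS_MASTER || ev == SK_RS_GRAND_MASTER);
}

/* Condition after the patch: warn only on the grand master recommendation. */
static int warns_after(int slave_only, enum ev_sketch ev)
{
	return slave_only && ev == SK_RS_GRAND_MASTER;
}

/* The two conditions differ only for a slave-only clock and RS_MASTER. */
static void check_patch_effect(void)
{
	assert(warns_before(1, SK_RS_MASTER) && !warns_after(1, SK_RS_MASTER));
	assert(warns_before(1, SK_RS_GRAND_MASTER) ==
	       warns_after(1, SK_RS_GRAND_MASTER));
	assert(warns_before(0, SK_RS_MASTER) == warns_after(0, SK_RS_MASTER));
}
```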

> 
> Thanks,
> 
> --
> Miroslav Lichvar



___
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel


Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-20 Thread Miroslav Lichvar
On Mon, May 17, 2021 at 10:02:59AM +0300, Amar Subramanyam via Linuxptp-devel 
wrote:
> This patch addresses the following issues when ptp4l is run on multiple ports
> with jbod and client only mode (i.e. clientOnly=1 and boundary_clock_jbod=1):-
> 
>   1.The LISTENING port prints continuously
>   "selected best master clock 00..03
>   updating UTC offset to 37"
> 
>   We limited the log such that now it prints only when there is a
>   change in the best-master clock.
> 
>   2.The port other than SLAVE (LISTENING port) prints an error
>   "port 1: master state recommended in slave only mode
>   ptp4l[1205469.356]: port 1: defaultDS.priority1 probably misconfigured"
>   for every ANNOUNCE RECEIPT Timeout.

I think it would be better to have separate patches for the two
issues.

> -static void clock_update_slave(struct clock *c)
> +static void clock_update_slave(struct clock *c, int mdiff)
>  {
>   struct parentDS *pds = &c->dad.pds;
>   struct ptp_message *msg;
>  
> - if (!c->best)
> + if (!c->best || !mdiff)
>   return;

Instead of adding and checking the mdiff parameter here, not doing
anything, why not just not call the function?

> diff --git a/port.c b/port.c
> index 10bb9e1..650ca00 100644
> --- a/port.c
> +++ b/port.c
> @@ -2531,7 +2531,7 @@ void port_dispatch(struct port *p, enum fsm_event 
> event, int mdiff)
>  static void bc_dispatch(struct port *p, enum fsm_event event, int mdiff)
>  {
>   if (clock_slave_only(p->clock)) {
> - if (event == EV_RS_MASTER || event == EV_RS_GRAND_MASTER) {
> + if (event == EV_RS_GRAND_MASTER) {
>   port_slave_priority_warning(p);
>   }
>   }

Makes sense to me.

There is a comment in bmc_state_decision() referring to this code,
which might need to be updated with this change, or maybe that whole
check specific to the Automotive profile could be removed if the log
message was the only reason for it (I'm not sure).

Thanks,

-- 
Miroslav Lichvar





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-12 Thread Richard Cochran
On Wed, May 12, 2021 at 10:28:34AM +0200, Miroslav Lichvar wrote:
> I think the fix should be one of the following:
> - disable clock check in jbod mode (it cannot work reliably as it is)
> - limit the check to timestamps from the synchronized port

This sounds like the best choice to me.

> - have a separate clock check instance for each clock, checking only
>   its own timestamps

Thanks,
Richard




Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-12 Thread Richard Cochran
On Wed, May 12, 2021 at 10:57:24AM +, Amar Subramanyam wrote:
> Should we rate limit log (a) as it will be printed whenever BMCA is triggered 
> and avoid log (b) when boundary_clock_jbod=1 and clientOnly=1

Sounds reasonable to me.

Thanks,
Richard





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-12 Thread Amar Subramanyam via Linuxptp-devel
Hi Richard,

Kindly discard the previous reply. Please find our response inline.

Thanks,
Amar B S


-Original Message-
From: Richard Cochran [mailto:richardcoch...@gmail.com] 
Sent: 11 May 2021 21:01
To: Amar Subramanyam 
Cc: Miroslav Lichvar ; linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


On Tue, May 11, 2021 at 02:16:04PM +, Amar Subramanyam via Linuxptp-devel 
wrote:

>> What happens here exactly is, due to the continuous triggering of 
>> BMCA, the value of mono_interval (the interval between two successive 
>> calls of clock_check_sample ()) gets increased and SYNCRONIZATION 
>> FAULT occurs with the default value of sanity_freq_limit (which is 
>> 2).

> Ah!  Now we are getting somewhere!

> This is a bug in the sanity check.  I've seen it, too.  Let's fix the bug.

>> Modifying the configuration with --sanity_freq_limit=0 will prevent 
>> the FAULT from occurring , but it will not address the root cause of 
>> BMCA getting triggered continuously even though there is no change in 
>> successive announce messages in the port.

> No, running the BMCA is not the root cause.  It is perfectly harmless to run 
> the BMCA with the same inputs and get the same result.  In fact, the 1588 
> standard specifies doing exactly this, over and over again.

>9.2.6.8 STATE_DECISION_EVENT

>...

>The STATE_DECISION_EVENT shall:

>-  Logically occur simultaneously on all ports of a clock
>-  Occur at least once per Announce message transmission interval

>IEEE Std 1588-2008, page 81

> The linuxptp implement does not follow this directive all the time, because 
> the BMCA has no time component.  If the inputs have not changed, then the 
> output remains the same.

> But you see that re-running the BMCA is harmless and not a bug.

We can look at fixing the issue with the sanity frequency limit, but we have no
clue why and how the default value of sanity_freq_limit was arrived at. One
option to consider would be to disable the limit by default, or to raise the
default to a much higher value.

However, we still need to address the issues below:

(a) ptp4l keeps logging the best master clock message (selected best master
clock 00..03), continuously flooding the log file.
(b) The inactive LISTENING port continuously prints an error (master state
recommended in slave only mode), which seems to apply to a boundary clock but
not to an Ordinary Clock Subordinate/Slave.
Logs:
ptp4l[755639.589]: rms 5 max 7 freq   -877 +/-   6 delay 25476 +/-   0
ptp4l[755639.818]: selected best master clock 00..03
ptp4l[755639.818]: updating UTC offset to 37
ptp4l[755639.818]: port 2: master state recommended in slave only mode
ptp4l[755639.818]: port 2: defaultDS.priority1 probably misconfigured
ptp4l[755639.990]: selected best master clock 00..03
ptp4l[755639.990]: updating UTC offset to 37
ptp4l[755640.193]: selected best master clock 00..03
ptp4l[755640.193]: updating UTC offset to 37
ptp4l[755640.193]: port 2: master state recommended in slave only mode
ptp4l[755640.193]: port 2: defaultDS.priority1 probably misconfigured
ptp4l[755640.385]: selected best master clock 00..03
ptp4l[755640.385]: updating UTC offset to 37
ptp4l[755640.568]: rms 4 max 8 freq   -877 +/-   6 delay 25476 +/-   0

Should we rate limit log (a), as it is printed whenever the BMCA is triggered,
and avoid log (b) when boundary_clock_jbod=1 and clientOnly=1?
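
One way to limit log (a) is to print it only when the selected best master
actually changes, rather than on every BMCA run. The sketch below shows that
suppress-unless-changed idea in isolation. It is not taken from the patch:
struct best_log_state and log_best_master_if_changed() are hypothetical names,
and the identities are made-up values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ID_LEN 8  /* a PTP clockIdentity is 8 octets */

/* Hypothetical per-clock state remembering what was logged last. */
struct best_log_state {
	unsigned char last_id[ID_LEN];
	int valid;  /* 0 until the first best master has been logged */
};

/* Returns 1 if a line was logged, 0 if it was suppressed as a repeat. */
static int log_best_master_if_changed(struct best_log_state *s,
				      const unsigned char id[ID_LEN])
{
	assert(s != NULL && id != NULL);
	if (s->valid && memcmp(s->last_id, id, ID_LEN) == 0)
		return 0;
	memcpy(s->last_id, id, ID_LEN);
	s->valid = 1;
	printf("selected best master clock "
	       "%02x%02x%02x.%02x%02x.%02x%02x%02x\n",
	       id[0], id[1], id[2], id[3], id[4], id[5], id[6], id[7]);
	return 1;
}
```

Calling it repeatedly with the same identity logs once; a different identity
logs again, so a real change of best master is still visible in the log.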

> Thanks,
> Richard




Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-12 Thread Miroslav Lichvar
On Tue, May 11, 2021 at 02:16:04PM +, Amar Subramanyam wrote:
> What happens here exactly is, due to the continuous triggering of  BMCA, the 
> value of mono_interval (the interval between two successive calls of 
> clock_check_sample ()) gets increased and SYNCRONIZATION FAULT occurs with 
> the default value of sanity_freq_limit (which is 2). Modifying the 
> configuration with --sanity_freq_limit=0 will prevent the FAULT from 
> occurring , but it will not address the root cause of BMCA getting triggered 
> continuously even though there is no change in successive announce messages 
> in the port. So we believe that setting --sanity_freq_limit=0 is a work 
> around and doesn't directly solve the issue.
> Hence the changes we introduced are such that BMCA is not triggered at all 
> when there is no change in successive announce messages.

Ok, so it is the clock check as I suspected. I don't see how it is
related to the BMCA or the announce timeout. The clock check is active
when the clock is in a synchronized state and it checks RX timestamps
of event messages.

If the clocks were not synchronized, sync messages received on
different ports failed the check. That's what I saw in my test, even
with your patch applied.

There is a race condition with phc2sys. It may not be fast enough to
sync the other clock before it receives an event message.

I think the fix should be one of the following:
- disable clock check in jbod mode (it cannot work reliably as it is)
- limit the check to timestamps from the synchronized port
- have a separate clock check instance for each clock, checking only
  its own timestamps

> > There might be a better name for this function. Maybe something related to 
> > its purpose rather than what it does.
> 
> Is the name "clock_get_port_client_state" fine?. Could you please propose any 
> new suggestions?

Maybe something like clock_non_client_port_announce_timer would work
better?

> > Ok, but if this optimization is useful in the jbod mode, it should be 
> > useful even in the non-jbod mode, right? Most of the port code shouldn't 
> > care about jbod.
> 
> Yes, as you suggested this change is useful in both jbod and non-jbod mode to 
> avoid unnecessary triggering of BMCA. But there is no impact seen in non jbod 
> case as there is only one port. Whereas there is clear impact on the SLAVE 
> port in jbod, slaveOnly case, as explained earlier. We didn't want to 
> introduce any new variables to the non jbod mode, hence we restricted our 
> change to the jbod mode alone.

I think it could work as an optimization to avoid unnecessary calls of
the BMCA and spam in the log, but that shouldn't be specific to the jbod
mode.

-- 
Miroslav Lichvar





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-12 Thread Amar Subramanyam via Linuxptp-devel
Hi Ramana,

Please find our response inline. Let us know if any changes are required.

Thanks
Amar B S

-Original Message-
From: Richard Cochran [mailto:richardcoch...@gmail.com] 
Sent: 11 May 2021 21:01
To: Amar Subramanyam 
Cc: Miroslav Lichvar ; linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


On Tue, May 11, 2021 at 02:16:04PM +, Amar Subramanyam via Linuxptp-devel 
wrote:

>> What happens here exactly is, due to the continuous triggering of 
>> BMCA, the value of mono_interval (the interval between two successive 
>> calls of clock_check_sample ()) gets increased and SYNCRONIZATION 
>> FAULT occurs with the default value of sanity_freq_limit (which is 
>> 2).

Ah!  Now we are getting somewhere!

This is a bug in the sanity check.  I've seen it, too.  Let's fix the bug.

>> Modifying the configuration with --sanity_freq_limit=0 will prevent 
>> the FAULT from occurring , but it will not address the root cause of 
>> BMCA getting triggered continuously even though there is no change in 
>> successive announce messages in the port.

No, running the BMCA is not the root cause.  It is perfectly harmless to run 
the BMCA with the same inputs and get the same result.  In fact, the 1588 
standard specifies doing exactly this, over and over again.

9.2.6.8 STATE_DECISION_EVENT

...

The STATE_DECISION_EVENT shall:

  -  Logically occur simultaneously on all ports of a clock
  -  Occur at least once per Announce message transmission interval

IEEE Std 1588-2008, page 81

 The linuxptp implement does not follow this directive all the time, because 
the BMCA has no time component.  If the inputs have not changed, then the 
output remains the same.

 But you see that re-running the BMCA is harmless and not a bug.

> Point noted.
> When we increase sanity_freq_limit to a higher value or disable it, we no
> longer see the port flapping between the UNCALIBRATED and SLAVE states. How
> was this default limit decided?
> When the BMCA is triggered continuously, we still see the issues below:
> 1. The log "ptp4l[1205111.854]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
>    ptp4l[1205111.857]: selected best master clock 00..05" is printed
>    continuously.
> 2. The log "port 1: master state recommended in slave only mode
>    ptp4l[1205469.356]: port 1: defaultDS.priority1 probably misconfigured"
>    is printed continuously.
> We need to rate limit the first log, as it is printed whenever the BMCA is
> triggered, and we need to avoid the second log when boundary_clock_jbod=1
> and clientOnly=1.


Thanks,
Richard




Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-11 Thread Richard Cochran
On Tue, May 11, 2021 at 02:16:04PM +, Amar Subramanyam via Linuxptp-devel 
wrote:

> What happens here exactly is, due to the continuous triggering of
> BMCA, the value of mono_interval (the interval between two
> successive calls of clock_check_sample ()) gets increased and
> SYNCRONIZATION FAULT occurs with the default value of
> sanity_freq_limit (which is 2).

Ah!  Now we are getting somewhere!

This is a bug in the sanity check.  I've seen it, too.  Let's fix the bug.

> Modifying the configuration with --sanity_freq_limit=0 will prevent
> the FAULT from occurring , but it will not address the root cause of
> BMCA getting triggered continuously even though there is no change
> in successive announce messages in the port.

No, running the BMCA is not the root cause.  It is perfectly harmless
to run the BMCA with the same inputs and get the same result.  In
fact, the 1588 standard specifies doing exactly this, over and over
again.

9.2.6.8 STATE_DECISION_EVENT

...

The STATE_DECISION_EVENT shall:

-  Logically occur simultaneously on all ports of a clock
-  Occur at least once per Announce message transmission interval

IEEE Std 1588-2008, page 81

The linuxptp implementation does not follow this directive all the time,
because the BMCA has no time component.  If the inputs have not
changed, then the output remains the same.

But you see that re-running the BMCA is harmless and not a bug.
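
The point that the BMCA has no time component can be illustrated with a toy
selection function: it depends only on the data sets it is given, so calling it
again with the same inputs yields the same best master. This is a deliberately
reduced sketch (priority1, clockClass, then identity, lower is better), not the
full IEEE 1588 data set comparison and not linuxptp code; struct ds_sketch,
ds_cmp() and pick_best() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced data set for illustration; the real one has more fields. */
struct ds_sketch {
	unsigned char priority1;
	unsigned char clock_class;
	unsigned char identity[8];
};

/* Negative if a is better than b, positive if worse, 0 if equal. */
static int ds_cmp(const struct ds_sketch *a, const struct ds_sketch *b)
{
	if (a->priority1 != b->priority1)
		return (int)a->priority1 - (int)b->priority1;
	if (a->clock_class != b->clock_class)
		return (int)a->clock_class - (int)b->clock_class;
	return memcmp(a->identity, b->identity, sizeof(a->identity));
}

/* Pure function of its inputs: no timers, no hidden state. */
static const struct ds_sketch *pick_best(const struct ds_sketch *list,
					 size_t n)
{
	const struct ds_sketch *best = NULL;
	size_t i;

	assert(list != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!best || ds_cmp(&list[i], best) < 0)
			best = &list[i];
	return best;
}
```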

Thanks,
Richard




Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-11 Thread Ramana Reddy via Linuxptp-devel
> One possible issue is you are setting clientOnly=1 for a BC; this is not
> allowed per IEEE1588 (as a BC shall not be a slaveOnly PTP Instance).
>
> From IEEE1588-2019:
>
> 9.2.3 Non-slaveOnly PTP Instances
> A Boundary Clock shall not be a slaveOnly PTP Instance. Ordinary Clocks
> not designed or configured as slaveOnly and Boundary Clocks shall
> implement the state machine illustrated in Figure 30.

I second that. At some point we need to define slave_clock_jbod for ordinary
clock mode and avoid using boundary_clock_jbod with the client_only option for
OC client/slave modes. We need to work out a design and make sure it addresses
all the aspects.
I believe the present change tries to fix the existing issue with the
boundary_clock_jbod and client_only config options.

Thanks,
Ramana

-Original Message-
From: Greg Armstrong [mailto:greg.armstrong...@renesas.com] 
Sent: 07 May 2021 19:30
To: Amar Subramanyam; Miroslav Lichvar
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


One possible issue is you are setting clientOnly=1 for a BC; this is not 
allowed per IEEE1588 (as a BC shall not be a slaveOnly PTP Instance).

From IEEE1588-2019:

9.2.3 Non-slaveOnly PTP Instances
A Boundary Clock shall not be a slaveOnly PTP Instance. Ordinary Clocks 
not designed or configured as slaveOnly and Boundary Clocks shall   
implement the state machine illustrated in Figure 30.

Greg

-Original Message-
From: Amar Subramanyam via Linuxptp-devel 
Sent: May 7, 2021 7:37 AM
To: Miroslav Lichvar 
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

Hi Miroslav,

Please find our response in line.

Thanks,
Amar B S

-Original Message-
From: Miroslav Lichvar [mailto:mlich...@redhat.com]
Sent: 06 May 2021 19:41
To: Amar Subramanyam 
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


On Tue, May 04, 2021 at 01:51:24PM +0300, Amar Subramanyam via Linuxptp-devel 
wrote:
> This patch addresses the following issues when ptp4l is ran on
> multiple ports with jbod and client only mode (i.e clientOnly=1 and
> boundary_clock_jbod=1):-
>
> 1.SYNCHRONIZATION FAULT occurs at every ANNOUNCE RECEIPT Timeout on
> LISTENING port,  which leads to PTP port state of SLAVE port to flap
> between SLAVE and UNCALIBRATED  states continuously.

>> It's not clear to me what exactly is happening here and how does the patch 
>> fix it. The faults are happening due to the clock check getting out of order 
>> timestamps from two unsynchronized clocks, right? Any chance it is an issue 
>> with phc2sys not synchronizing the clocks?

Please find the attached diagram, which details our use case. Two active
grandmasters are connected to the same Telecom Slave Clock on two different
ports, PORT1 and PORT2. A single instance each of ptp4l and phc2sys runs on the
Telecom Slave Clock with the boundary_clock_jbod=1 and clientOnly=1
configuration. The expected behaviour is that ptp4l takes both GM1 and GM2 into
account and chooses the best master using the BMCA. The port that has the best
master will be in the SLAVE state, while the other port remains in the
LISTENING state, acting as a redundant port.
Let GM1 be a better master than GM2, so that Port1 is in the SLAVE state and
Port2 in the LISTENING state. While running the latest ptp4l, we observe the
following issues, for which we have proposed our patch:

Since Port 2 is in the LISTENING state even though we are receiving Announce
packets from GM2, the function add_foreign_master() is called for every
Announce message received. Here, the announce receipt timer is not re-armed and
hence triggers the BMCA every 375 ms (default Announce receipt timeout). This
is expected behaviour for normal client-only mode with only one port, but when
multiple client/slave ports are involved (say Port 1 and Port 2), we don't
expect Port 2 to trigger the BMCA unless there is a change in the Announce
message received from GM2.
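
As a worked example of where a figure like 375 ms can come from: the nominal
announce receipt timeout is announceReceiptTimeout multiplied by
2^logAnnounceInterval seconds. The sketch below assumes the values 3 and -3,
which reproduce 375 ms; the message does not state which values were actually
configured, and the sketch ignores any randomization an implementation may add
to the timer. announce_timeout_us() is a hypothetical helper, not a linuxptp
function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nominal announce receipt timeout in microseconds:
 *   announceReceiptTimeout * 2^logAnnounceInterval seconds.
 * The result is truncated to whole microseconds for very small intervals.
 */
static int64_t announce_timeout_us(unsigned int receipt_timeout,
				   int log_announce_interval)
{
	int64_t interval_us = 1000000;

	assert(log_announce_interval >= -16 && log_announce_interval <= 16);
	if (log_announce_interval >= 0)
		interval_us <<= log_announce_interval;
	else
		interval_us >>= -log_announce_interval;
	return (int64_t)receipt_timeout * interval_us;
}
```

With these assumed inputs, announce_timeout_us(3, -3) gives 375000 us, that
is 3 x 125 ms. With a 2 s announce interval (log value 1) the same timeout
count gives 6 s.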

Continuous BMCA triggering on Port 2 causes a SYNCHRONIZATION FAULT on
Port 1. This causes Port 1 to jump from SLAVE to UNCALIBRATED and vice versa
repeatedly.

Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-11 Thread Amar Subramanyam via Linuxptp-devel
Hi Miroslav Lichvar,

Thanks for your review. Please find our comments inline.

Thanks,
Amar B S

-Original Message-
From: Miroslav Lichvar [mailto:mlich...@redhat.com] 
Sent: 10 May 2021 16:58
To: Amar Subramanyam 
Cc: linuxptp-devel@lists.sourceforge.net; Ramana Reddy ; 
Karthikkumar Valoor 
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


On Fri, May 07, 2021 at 11:37:06AM +, Amar Subramanyam wrote:
>> Continuous BMCA triggering in port 2 causes a SYNCHRONIZATION FAULT in 
>> port1.This causes port1 to jump from SLAVE to UNCALIBRATED and vice versa 
>> repeatedly.

> I'm sorry if I sound like a broken record, but what exactly is the cause of 
> the fault? Is it the clock check seeing timestamps from two clocks that are 
> not synchronized? Do the faults disappear if you set --sanity_freq_limit=0?

What happens here exactly is that, due to the continuous triggering of the
BMCA, the value of mono_interval (the interval between two successive calls of
clock_check_sample()) increases, and a SYNCHRONIZATION FAULT occurs with the
default value of sanity_freq_limit (which is 2). Setting
--sanity_freq_limit=0 prevents the FAULT from occurring, but it does not
address the root cause: the BMCA is triggered continuously even though
successive Announce messages on the port do not change. So we believe that
setting --sanity_freq_limit=0 is a workaround and doesn't directly solve the
issue.
Hence the changes we introduced ensure that the BMCA is not triggered at all
when there is no change in successive Announce messages.
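
For readers unfamiliar with the sanity check, here is a simplified sketch of
the kind of test being discussed: it compares the time elapsed according to
received timestamps with the time elapsed on a monotonic clock, and flags the
sample when the apparent frequency offset exceeds a limit. This is not the
linuxptp implementation; sanity_check_fails() is a hypothetical name, and the
limit used in the note below is an illustrative value in parts per billion.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if the frequency offset implied by the two intervals lies
 * outside +/- limit_ppb, 0 otherwise. Intervals are in nanoseconds.
 */
static int sanity_check_fails(int64_t ts_interval_ns,
			      int64_t mono_interval_ns,
			      double limit_ppb)
{
	double foffset_ppb;

	assert(mono_interval_ns > 0);
	foffset_ppb = 1e9 * ((double)ts_interval_ns /
			     (double)mono_interval_ns - 1.0);
	return foffset_ppb > limit_ppb || foffset_ppb < -limit_ppb;
}
```

For example, sanity_check_fails(5000000000LL, 1000000000LL, 200000000.0)
returns 1: a timestamp interval of 5 s seen over 1 s of monotonic time, as can
happen when consecutive timestamps come from two unsynchronized clocks, looks
like a very large frequency error.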

> That's the only fault I see in my test and in your original report.
> Your patch doesn't seem to prevent that fault in my test, so I'm confused.

With the changes in our patch, the BMCA is not triggered for the LISTENING port
unless successive Announce messages change. Thus mono_interval does not
suddenly increase to a high value, and ptp4l functions properly even with the
default value of sanity_freq_limit (2).

>> Noted. Please find the updated description and function name below, we will 
>> send out the modified patch after full review.
>>
>> + * Get port SLAVE state for client only mode.
>>  + * @param c  The clock instance.
>>  + * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
>>  + */
>>  +int clock_get_port_slave_state(struct clock *c);

> There might be a better name for this function. Maybe something related to 
> its purpose rather than what it does.

Is the name "clock_get_port_client_state" fine? Could you please suggest an
alternative?

>> Hence, we are clearing the Announce receipt timer for port2 (LISTENING port) 
>> in the function bc_event() upon Announce receipt timeout, only when 
>> boundary_clock_jbod=1 and clientOnly=1 is configured and atleast one port 
>> (Port1 here) is in SLAVE/UNCALIBRATED state.

> Ok, but if this optimization is useful in the jbod mode, it should be useful 
> even in the non-jbod mode, right? Most of the port code shouldn't care about 
> jbod.

Yes, as you suggested, this change is useful in both jbod and non-jbod mode to
avoid unnecessary triggering of the BMCA. But no impact is seen in the non-jbod
case, as there is only one port, whereas there is a clear impact on the SLAVE
port in the jbod, slaveOnly case, as explained earlier. We didn't want to
introduce any new variables into the non-jbod mode, so we restricted our change
to the jbod mode alone.

--
Miroslav Lichvar





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-10 Thread Greg Armstrong
It may not break anything at the protocol, but I suspect the state decision 
algorithm is what's breaking. By setting clientOnly, you don't allow any PTP 
port to enter the MASTER (or PASSIVE) state (see IEEE1588-2019 Figure 31 for 
slave-only state machine). However, a BC would normally put any PTP port that 
is not SLAVE/UNCALIBRATED into the MASTER/PRE_MASTER or PASSIVE state - these 
are not allowed in slave-only.

I suspect this is why LISTENING is used by the 2nd PTP port (when the 1st PTP
port is in the SLAVE state). However, once an Announce is received from the 2nd
GM, the local PTP port needs to change to the UNCALIBRATED state. From this
state, the only allowed transition is to the SLAVE state - and this is likely
what breaks, as there is already a port in the SLAVE state.

I have not delved into the ptp4l state machine, but I suspect this causes all
the PTP ports to enter the FAULTY state. This would explain the observed
toggling in and out of the FAULTY state.

Greg

-Original Message-
From: Miroslav Lichvar  
Sent: May 10, 2021 7:58 AM
To: Greg Armstrong 
Cc: Amar Subramanyam ; 
linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

On Fri, May 07, 2021 at 02:27:46PM +, Greg Armstrong wrote:
> Just to add, the key reason it is not supported by BC is that if true, then 
> clockClass must be 255. This clockClass is only for slave-only OC.
> 
> From IEEE1588-2019 clause 8.2.1.3.1.2:
>   If defaultDS.slaveOnly is TRUE, the initialization value [of 
> defaultDS.clockQuality.clockClass] shall be 255 as specified in 7.6.2.5.
> 
> From IEEE1588-2019 Table 4:
>   255 |   Shall be the clockClass of a slave-only PTP Instance 
> (see 9.2.2.1).
> 
> And clause 9.2.2.1 title is "Slave-Only Ordinary Clocks".

Good point. 

But this doesn't break anything at the protocol level, right?
A slaveOnly clock should never send an announcement message with its clockClass.

It is an extension that we can support in linuxptp, assuming we can make it 
work as expected, e.g. avoid those "priority1 probably misconfigured" warnings, 
etc.

--
Miroslav Lichvar





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-10 Thread Miroslav Lichvar
On Fri, May 07, 2021 at 02:27:46PM +, Greg Armstrong wrote:
> Just to add, the key reason it is not supported by BC is that if true, then 
> clockClass must be 255. This clockClass is only for slave-only OC.
> 
> From IEEE1588-2019 clause 8.2.1.3.1.2:
>   If defaultDS.slaveOnly is TRUE, the initialization value [of 
> defaultDS.clockQuality.clockClass] shall be 255 as specified in 7.6.2.5.
> 
> From IEEE1588-2019 Table 4:
>   255 |   Shall be the clockClass of a slave-only PTP Instance 
> (see 9.2.2.1).
> 
> And clause 9.2.2.1 title is "Slave-Only Ordinary Clocks".

Good point. 

But this doesn't break anything at the protocol level, right?
A slaveOnly clock should never send an announcement message with its
clockClass.

It is an extension that we can support in linuxptp, assuming we can
make it work as expected, e.g. avoid those "priority1 probably
misconfigured" warnings, etc.

-- 
Miroslav Lichvar





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-10 Thread Miroslav Lichvar
On Fri, May 07, 2021 at 11:37:06AM +, Amar Subramanyam wrote:
> Continuous BMCA triggering on port 2 causes a SYNCHRONIZATION FAULT on 
> port 1. This causes port 1 to flip between SLAVE and UNCALIBRATED 
> repeatedly.

I'm sorry if I sound like a broken record, but what exactly is the
cause of the fault? Is it the clock check seeing timestamps from two
clocks that are not synchronized? Do the faults disappear if you set
--sanity_freq_limit=0?

That's the only fault I see in my test and in your original report.
Your patch doesn't seem to prevent that fault in my test, so I'm
confused.
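
To illustrate why timestamps from two unsynchronized clocks could trip a
frequency sanity check, here is a simplified sketch. It is not the code in
linuxptp's clockcheck.c; the struct and function names are hypothetical. It
only assumes what the question above implies: the check compares the
apparent rate of the PTP timestamps against a monotonic clock, and a limit
of 0 disables it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One observation: a PTP timestamp paired with a monotonic reading. */
struct ts_pair {
	int64_t ptp_ns;  /* timestamp taken from a PTP hardware clock */
	int64_t mono_ns; /* monotonic clock reading at the same moment */
};

/*
 * Returns true when the rate of the PTP timestamps, measured against the
 * monotonic clock between two samples, is off by more than limit_ppb.
 * A limit of 0 disables the check.
 */
bool freq_check_fails(struct ts_pair prev, struct ts_pair cur,
		      int64_t limit_ppb)
{
	int64_t mono_delta = cur.mono_ns - prev.mono_ns;
	int64_t ptp_delta = cur.ptp_ns - prev.ptp_ns;
	double offset_ppb;

	if (limit_ppb == 0)
		return false; /* check disabled */
	assert(mono_delta > 0);

	offset_ppb = 1e9 * (double)(ptp_delta - mono_delta) / (double)mono_delta;
	if (offset_ppb < 0)
		offset_ppb = -offset_ppb;
	return offset_ppb > (double)limit_ppb;
}
```

If the previous sample came from one PHC and the current one from another
PHC that is several seconds ahead, ptp_delta is far larger than mono_delta
and the check fails even though neither clock is misbehaving on its own.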

> Noted. Please find the updated description and function name below, we will 
> send out the modified patch after full review. 
> 
>  + * Get port SLAVE state for client only mode.
>  + * @param c  The clock instance.
>  + * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
>  + */
>  +int clock_get_port_slave_state(struct clock *c);

There might be a better name for this function. Maybe something
related to its purpose rather than what it does.
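
As a sketch of what a purpose-oriented predicate could look like: the name
clock_has_tracking_port() is hypothetical, and the types below are reduced
stand-ins for linuxptp's struct clock and enum port_state so that the
example compiles on its own. Note that it returns true where the proposed
function returns 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for linuxptp's types, reduced to what the sketch needs. */
enum port_state { PS_LISTENING, PS_UNCALIBRATED, PS_SLAVE };

struct clock {
	bool slave_only;
	const enum port_state *port_states;
	size_t n_ports;
};

/*
 * True when a slave-only clock already has a port tracking a master
 * (SLAVE or UNCALIBRATED), so a redundant LISTENING port does not need
 * to keep re-running the BMCA on every announce receipt timeout.
 */
bool clock_has_tracking_port(const struct clock *c)
{
	size_t i;

	assert(c != NULL);
	if (!c->slave_only)
		return false;
	for (i = 0; i < c->n_ports; i++) {
		if (c->port_states[i] == PS_SLAVE ||
		    c->port_states[i] == PS_UNCALIBRATED)
			return true;
	}
	return false;
}
```

With a boolean predicate the call site reads as a positive condition, for
example "if (clock_has_tracking_port(p->clock))", instead of negating an
int whose 0 means "yes".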

> Hence, we are clearing the Announce receipt timer for port 2 (LISTENING port) 
> in the function bc_event() upon Announce receipt timeout, only when 
> boundary_clock_jbod=1 and clientOnly=1 are configured and at least one port 
> (Port 1 here) is in SLAVE/UNCALIBRATED state.

Ok, but if this optimization is useful in the jbod mode, it should be
useful even in the non-jbod mode, right? Most of the port code
shouldn't care about jbod.

-- 
Miroslav Lichvar





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-07 Thread Greg Armstrong
Just to add, the key reason it is not supported by BC is that if true, then 
clockClass must be 255. This clockClass is only for slave-only OC.

From IEEE1588-2019 clause 8.2.1.3.1.2:
If defaultDS.slaveOnly is TRUE, the initialization value [of 
defaultDS.clockQuality.clockClass] shall be 255 as specified in 7.6.2.5.

From IEEE1588-2019 Table 4:
255 |   Shall be the clockClass of a slave-only PTP Instance 
(see 9.2.2.1).

And clause 9.2.2.1 title is "Slave-Only Ordinary Clocks".

Greg

-----Original Message-----
From: Greg Armstrong 
Sent: May 7, 2021 10:00 AM
To: Amar Subramanyam ; Miroslav Lichvar 

Cc: linuxptp-devel@lists.sourceforge.net
Subject: RE: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

One possible issue is you are setting clientOnly=1 for a BC; this is not 
allowed per IEEE1588 (as a BC shall not be a slaveOnly PTP Instance).

From IEEE1588-2019:

9.2.3 Non-slaveOnly PTP Instances
A Boundary Clock shall not be a slaveOnly PTP Instance. Ordinary Clocks 
not designed or configured as slaveOnly and Boundary Clocks shall   
implement the state machine illustrated in Figure 30.

Greg

-----Original Message-----
From: Amar Subramanyam via Linuxptp-devel 
Sent: May 7, 2021 7:37 AM
To: Miroslav Lichvar 
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

Hi Miroslav,

Please find our response in line.

Thanks,
Amar B S

-----Original Message-----
From: Miroslav Lichvar [mailto:mlich...@redhat.com]
Sent: 06 May 2021 19:41
To: Amar Subramanyam 
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


On Tue, May 04, 2021 at 01:51:24PM +0300, Amar Subramanyam via Linuxptp-devel 
wrote:
> This patch addresses the following issues when ptp4l is ran on 
> multiple ports with jbod and client only mode (i.e clientOnly=1 and
> boundary_clock_jbod=1):-
>
> 1.SYNCHRONIZATION FAULT occurs at every ANNOUNCE RECEIPT Timeout on 
> LISTENING port,  which leads to PTP port state of SLAVE port to flap 
> between SLAVE and UNCALIBRATED  states continuously.

>> It's not clear to me what exactly is happening here and how does the patch 
>> fix it. The faults are happening due to the clock check getting out of order 
>> timestamps from two unsynchronized clocks, right? Any chance it is an issue 
>> with phc2sys not synchronizing the clocks?

Please find the attached diagram, which details our use case. Two active 
grandmasters are connected to the same Telecom Slave Clock on two different 
ports, PORT1 and PORT2. A single instance each of ptp4l and phc2sys is running 
on the Telecom Slave Clock with boundary_clock_jbod=1 and clientOnly=1 
configured. The expected behaviour is that ptp4l takes both GM1 and GM2 into 
account and chooses the best master using the BMCA. The port that sees the best 
master will be in SLAVE state while the other port remains in LISTENING state, 
acting as a redundant port.
Let GM1 be a better master than GM2, so that Port 1 is in SLAVE state and 
Port 2 is in LISTENING state. While running the latest ptp4l, we observe the 
following issues, for which we have proposed our patch:

Since Port 2 is in LISTENING state even though we are receiving Announce packets 
from GM2, the function add_foreign_master() is called for every Announce 
message received. Here, the announce receipt timer is not re-armed and hence 
will trigger the BMCA every 375 ms (default Announce receipt timeout). This is 
expected behaviour for normal client-only mode with only one port, but when 
multiple client/slave ports are involved (say port 1 and port 2), we don't 
expect port 2 to trigger the BMCA unless there is a change in the Announce 
message received from GM2.

Continuous BMCA triggering on port 2 causes a SYNCHRONIZATION FAULT on 
port 1. This causes port 1 to flip between SLAVE and UNCALIBRATED 
repeatedly.
Hence, we are re-arming the Announce receipt timer in the function 
add_foreign_master(), only when boundary_clock_jbod=1 and clientOnly=1 are 
configured and at least one port (Port 1 here) is in SLAVE/UNCALIBRATED state.

> +int clock_get_client_state(struct clock *c)
> +{
> +	struct port *piter;
> +
> +	if (!clock_slave_only(c)) {
> +		return 1;
> +	}
> +
> +	LIST_FOREACH(piter, &c->ports, list) {
> +		enum port_state ps = port_state(piter);
> +		if (ps == PS_SLAVE || ps == PS_UNCALIBRATED) {
> +

Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-07 Thread Greg Armstrong
One possible issue is you are setting clientOnly=1 for a BC; this is not 
allowed per IEEE1588 (as a BC shall not be a slaveOnly PTP Instance).

From IEEE1588-2019:

9.2.3 Non-slaveOnly PTP Instances
A Boundary Clock shall not be a slaveOnly PTP Instance. Ordinary Clocks 
not designed or configured as slaveOnly and Boundary Clocks shall   
implement the state machine illustrated in Figure 30.

Greg

-----Original Message-----
From: Amar Subramanyam via Linuxptp-devel 
 
Sent: May 7, 2021 7:37 AM
To: Miroslav Lichvar 
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

Hi Miroslav,

Please find our response in line.

Thanks,
Amar B S

-----Original Message-----
From: Miroslav Lichvar [mailto:mlich...@redhat.com]
Sent: 06 May 2021 19:41
To: Amar Subramanyam 
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


On Tue, May 04, 2021 at 01:51:24PM +0300, Amar Subramanyam via Linuxptp-devel 
wrote:
> This patch addresses the following issues when ptp4l is ran on 
> multiple ports with jbod and client only mode (i.e clientOnly=1 and
> boundary_clock_jbod=1):-
>
> 1.SYNCHRONIZATION FAULT occurs at every ANNOUNCE RECEIPT Timeout on 
> LISTENING port,  which leads to PTP port state of SLAVE port to flap 
> between SLAVE and UNCALIBRATED  states continuously.

>> It's not clear to me what exactly is happening here and how does the patch 
>> fix it. The faults are happening due to the clock check getting out of order 
>> timestamps from two unsynchronized clocks, right? Any chance it is an issue 
>> with phc2sys not synchronizing the clocks?

Please find the attached diagram, which details our use case. Two active 
grandmasters are connected to the same Telecom Slave Clock on two different 
ports, PORT1 and PORT2. A single instance each of ptp4l and phc2sys is running 
on the Telecom Slave Clock with boundary_clock_jbod=1 and clientOnly=1 
configured. The expected behaviour is that ptp4l takes both GM1 and GM2 into 
account and chooses the best master using the BMCA. The port that sees the best 
master will be in SLAVE state while the other port remains in LISTENING state, 
acting as a redundant port.
Let GM1 be a better master than GM2, so that Port 1 is in SLAVE state and 
Port 2 is in LISTENING state. While running the latest ptp4l, we observe the 
following issues, for which we have proposed our patch:

Since Port 2 is in LISTENING state even though we are receiving Announce packets 
from GM2, the function add_foreign_master() is called for every Announce 
message received. Here, the announce receipt timer is not re-armed and hence 
will trigger the BMCA every 375 ms (default Announce receipt timeout). This is 
expected behaviour for normal client-only mode with only one port, but when 
multiple client/slave ports are involved (say port 1 and port 2), we don't 
expect port 2 to trigger the BMCA unless there is a change in the Announce 
message received from GM2.

Continuous BMCA triggering on port 2 causes a SYNCHRONIZATION FAULT on 
port 1. This causes port 1 to flip between SLAVE and UNCALIBRATED 
repeatedly.
Hence, we are re-arming the Announce receipt timer in the function 
add_foreign_master(), only when boundary_clock_jbod=1 and clientOnly=1 are 
configured and at least one port (Port 1 here) is in SLAVE/UNCALIBRATED state.

> +int clock_get_client_state(struct clock *c)
> +{
> +	struct port *piter;
> +
> +	if (!clock_slave_only(c)) {
> +		return 1;
> +	}
> +
> +	LIST_FOREACH(piter, &c->ports, list) {
> +		enum port_state ps = port_state(piter);
> +		if (ps == PS_SLAVE || ps == PS_UNCALIBRATED) {
> +			return 0;
> +		}
> +	}
> +	return 1;
> +}

> + * Inform if any of the port is in SLAVE state.
> + * @param c  The clock instance.
> + * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
> + */
> +int clock_get_client_state(struct clock *c);

>> The function seems to be specific to the slave-only mode and it checks for 
>> two port states. The description and name of the function indicate something 
>> else.

Noted. Please find the updated description and function name below, we will 
send out the modified patch after full review. 

 + * Get port SLAVE state for client only mode.
 + * @param c  The clock instance.
 + * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
 + */
 +int clock_get_port_slave_state(struct clock *c);

> +  * In multiport slave 

Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-07 Thread Amar Subramanyam via Linuxptp-devel
Hi Miroslav,

Please find our response in line.

Thanks,
Amar B S

-----Original Message-----
From: Miroslav Lichvar [mailto:mlich...@redhat.com] 
Sent: 06 May 2021 19:41
To: Amar Subramanyam 
Cc: linuxptp-devel@lists.sourceforge.net
Subject: Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran 
with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)


On Tue, May 04, 2021 at 01:51:24PM +0300, Amar Subramanyam via Linuxptp-devel 
wrote:
> This patch addresses the following issues when ptp4l is ran on 
> multiple ports with jbod and client only mode (i.e clientOnly=1 and 
> boundary_clock_jbod=1):-
>
> 1.SYNCHRONIZATION FAULT occurs at every ANNOUNCE RECEIPT Timeout on 
> LISTENING port,  which leads to PTP port state of SLAVE port to flap 
> between SLAVE and UNCALIBRATED  states continuously.

>> It's not clear to me what exactly is happening here and how does the patch 
>> fix it. The faults are happening due to the clock check getting out of order 
>> timestamps from two unsynchronized clocks, right? Any chance it is an issue 
>> with phc2sys not synchronizing the clocks?

Please find the attached diagram, which details our use case. Two active 
grandmasters are connected to the same Telecom Slave Clock on two different 
ports, PORT1 and PORT2. A single instance each of ptp4l and phc2sys is running 
on the Telecom Slave Clock with boundary_clock_jbod=1 and clientOnly=1 
configured. The expected behaviour is that ptp4l takes both GM1 and GM2 into 
account and chooses the best master using the BMCA. The port that sees the best 
master will be in SLAVE state while the other port remains in LISTENING state, 
acting as a redundant port.
Let GM1 be a better master than GM2, so that Port 1 is in SLAVE state and 
Port 2 is in LISTENING state. While running the latest ptp4l, we observe the 
following issues, for which we have proposed our patch:

Since Port 2 is in LISTENING state even though we are receiving Announce packets 
from GM2, the function add_foreign_master() is called for every Announce 
message received. Here, the announce receipt timer is not re-armed and hence 
will trigger the BMCA every 375 ms (default Announce receipt timeout). This is 
expected behaviour for normal client-only mode with only one port, but when 
multiple client/slave ports are involved (say port 1 and port 2), we don't 
expect port 2 to trigger the BMCA unless there is a change in the Announce 
message received from GM2.
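
The 375 ms figure can be reproduced as announceReceiptTimeout multiplied by
2^logAnnounceInterval. The sketch below assumes announceReceiptTimeout=3 and
logAnnounceInterval=-3; the message does not state these values, they are
simply a pair that yields 375 ms. It also ignores any random spread an
implementation may add when arming the timer. The function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Interval of 2^log_seconds seconds, expressed in nanoseconds. */
static int64_t log_interval_ns(int log_seconds)
{
	const int64_t ns_per_sec = 1000000000LL;

	/* Keep the shift well inside the range of a 64-bit integer. */
	assert(log_seconds >= -30 && log_seconds <= 20);
	if (log_seconds >= 0)
		return ns_per_sec << log_seconds;
	return ns_per_sec >> -log_seconds;
}

/* Nominal announce receipt timeout: receipt_timeout announce intervals. */
int64_t announce_receipt_timeout_ns(int receipt_timeout,
				    int log_announce_interval)
{
	assert(receipt_timeout >= 0 && receipt_timeout <= 255);
	return (int64_t)receipt_timeout * log_interval_ns(log_announce_interval);
}
```

With the assumed values, announce_receipt_timeout_ns(3, -3) gives
375000000 ns: three intervals of 125 ms each.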

Continuous BMCA triggering on port 2 causes a SYNCHRONIZATION FAULT on 
port 1. This causes port 1 to flip between SLAVE and UNCALIBRATED 
repeatedly.
Hence, we are re-arming the Announce receipt timer in the function 
add_foreign_master(), only when boundary_clock_jbod=1 and clientOnly=1 are 
configured and at least one port (Port 1 here) is in SLAVE/UNCALIBRATED state.

> +int clock_get_client_state(struct clock *c)
> +{
> +	struct port *piter;
> +
> +	if (!clock_slave_only(c)) {
> +		return 1;
> +	}
> +
> +	LIST_FOREACH(piter, &c->ports, list) {
> +		enum port_state ps = port_state(piter);
> +		if (ps == PS_SLAVE || ps == PS_UNCALIBRATED) {
> +			return 0;
> +		}
> +	}
> +	return 1;
> +}

> + * Inform if any of the port is in SLAVE state.
> + * @param c  The clock instance.
> + * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
> + */
> +int clock_get_client_state(struct clock *c);

>> The function seems to be specific to the slave-only mode and it checks for 
>> two port states. The description and name of the function indicate something 
>> else.

Noted. Please find the updated description and function name below, we will 
send out the modified patch after full review. 

 + * Get port SLAVE state for client only mode.
 + * @param c  The clock instance.
 + * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
 + */
 +int clock_get_port_slave_state(struct clock *c);

> +  * In multiport slave only mode, there maybe
> +  * announce messages on LISTENING port. Re-arm
> +  * the timer if any other configured port is in SLAVE state
> +  */
> + if (p->jbod && !clock_get_client_state(p->clock)) {
> + port_set_announce_tmo(p);
> + }

>> Why is this and the other part of the port change specific to the jbod mode? 
>> Shouldn't the announce timer work exactly the same no matter if it's a jbod 
>> or a real multiport clock?

Please find the attached diagram, which details our use case. Two active grand 
masters are connected to the same Telecom Slave Clock in two different ports, 
PORT1 and PORT2. Single instance of Ptp4l and phc2sys are running in the 
Telecom Slave Clock with 

Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-06 Thread Miroslav Lichvar
On Tue, May 04, 2021 at 01:51:24PM +0300, Amar Subramanyam via Linuxptp-devel 
wrote:
> This patch addresses the following issues when ptp4l is ran on multiple ports
> with jbod and client only mode (i.e clientOnly=1 and boundary_clock_jbod=1):-
> 
> 1.SYNCHRONIZATION FAULT occurs at every ANNOUNCE RECEIPT Timeout on LISTENING
>  port, which leads to PTP port state of SLAVE port to flap between SLAVE and
>  UNCALIBRATED states continuously.

It's not clear to me what exactly is happening here and how does the
patch fix it. The faults are happening due to the clock check getting
out of order timestamps from two unsynchronized clocks, right? Any
chance it is an issue with phc2sys not synchronizing the clocks?

> +int clock_get_client_state(struct clock *c)
> +{
> +	struct port *piter;
> +
> +	if (!clock_slave_only(c)) {
> +		return 1;
> +	}
> +
> +	LIST_FOREACH(piter, &c->ports, list) {
> +		enum port_state ps = port_state(piter);
> +		if (ps == PS_SLAVE || ps == PS_UNCALIBRATED) {
> +			return 0;
> +		}
> +	}
> +	return 1;
> +}

> + * Inform if any of the port is in SLAVE state.
> + * @param c  The clock instance.
> + * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
> + */
> +int clock_get_client_state(struct clock *c);

The function seems to be specific to the slave-only mode and it checks
for two port states. The description and name of the function indicate
something else.

> +  * In multiport slave only mode, there maybe
> +  * announce messages on LISTENING port. Re-arm
> +  * the timer if any other configured port is in SLAVE state
> +  */
> + if (p->jbod && !clock_get_client_state(p->clock)) {
> + port_set_announce_tmo(p);
> + }

Why is this and the other part of the port change specific to the jbod
mode? Shouldn't the announce timer work exactly the same no matter if
it's a jbod or a real multiport clock?

-- 
Miroslav Lichvar





Re: [Linuxptp-devel] [PATCH] Sync issues observed when ptp4l is ran with jbod and client only mode (clientOnly=1 and boundary_clock_jbod=1)

2021-05-04 Thread Amar Subramanyam via Linuxptp-devel
Hi,

The commands we used for testing the issues below are:

For the 8275.1 profile:
ptp4l -f config_ptp_8275_1.conf -i sriov1 -i sriov0 -H -m 2 --boundary_clock=1 --slaveOnly=1
phc2sys -a -r -m -R 16 -n 24

For the 8275.2 profile:
ptp4l -f config_ptp_8275_2.conf -i sriov1 -i sriov0 -H -m 4 --boundary_clock=1 --slaveOnly=1
phc2sys -a -r -m -R 16 -n 44

Thanks,
Amar B S

-----Original Message-----
From: Amar Subramanyam 
Sent: 04 May 2021 16:21
To: linuxptp-devel@lists.sourceforge.net
Cc: Amar Subramanyam ; Karthikkumar Valoor 
; Ramana Reddy 
Subject: [PATCH] Sync issues observed when ptp4l is ran with jbod and client 
only mode (clientOnly=1 and boundary_clock_jbod=1)

This patch addresses the following issues when ptp4l is run on multiple ports 
with jbod and client only mode (i.e. clientOnly=1 and boundary_clock_jbod=1):

1. A SYNCHRONIZATION FAULT occurs at every ANNOUNCE RECEIPT Timeout on the 
   LISTENING port, which causes the SLAVE port to flap between the SLAVE and 
   UNCALIBRATED states continuously.
2. When both ports are receiving announce messages, the port other than the 
   SLAVE port is always in LISTENING state. This results in the BMCA being 
   triggered at every ANNOUNCE RECEIPT Timeout even though there is no change 
   in successive announce messages.
3. The port other than SLAVE (the LISTENING port) prints the error
   "port 1: master state recommended in slave only mode
   ptp4l[1205469.356]: port 1: defaultDS.priority1 probably misconfigured"
   for every ANNOUNCE RECEIPT Timeout.
4. When the port other than the SLAVE port stops receiving announce packets, 
   the BMCA is triggered at every ANNOUNCE RECEIPT Timeout indefinitely.

Signed-off-by: Amar Subramanyam 
Signed-off-by: Karthikkumar Valoor 
Signed-off-by: Ramana Reddy 
---
 clock.c | 17 +
 clock.h |  7 +++
 port.c  | 16 
 3 files changed, 40 insertions(+)

diff --git a/clock.c b/clock.c
index e545a9b..aedba6d 100644
--- a/clock.c
+++ b/clock.c
@@ -1870,6 +1870,23 @@ enum servo_state clock_synchronize(struct clock *c, tmv_t ingress, tmv_t origin)
return state;
 }
 
+int clock_get_client_state(struct clock *c)
+{
+	struct port *piter;
+
+	if (!clock_slave_only(c)) {
+		return 1;
+	}
+
+	LIST_FOREACH(piter, &c->ports, list) {
+		enum port_state ps = port_state(piter);
+		if (ps == PS_SLAVE || ps == PS_UNCALIBRATED) {
+			return 0;
+		}
+	}
+	return 1;
+}
+
 void clock_sync_interval(struct clock *c, int n)
 {
int shift;
diff --git a/clock.h b/clock.h
index 845d54f..4779ec9 100644
--- a/clock.h
+++ b/clock.h
@@ -326,6 +326,13 @@ enum servo_state clock_synchronize(struct clock *c, tmv_t ingress,
   tmv_t origin);
 
 /**
+ * Inform if any of the port is in SLAVE state.
+ * @param c  The clock instance.
+ * @return   Return 0 if any port is in SLAVE state, 1 otherwise.
+ */
+int clock_get_client_state(struct clock *c);
+
+/**
  * Inform a slaved clock about the master's sync interval.
  * @param c  The clock instance.
  * @param n  The logarithm base two of the sync interval.
diff --git a/port.c b/port.c
index 10bb9e1..7d10bb8 100644
--- a/port.c
+++ b/port.c
@@ -390,6 +390,15 @@ static int add_foreign_master(struct port *p, struct ptp_message *m)
diff = announce_compare(m, tmp);
}
 
+   /*
+* In multiport slave only mode, there maybe
+* announce messages on LISTENING port. Re-arm
+* the timer if any other configured port is in SLAVE state
+*/
+   if (p->jbod && !clock_get_client_state(p->clock)) {
+   port_set_announce_tmo(p);
+   }
+
return broke_threshold || diff;
 }
 
@@ -2654,6 +2663,13 @@ static enum fsm_event bc_event(struct port *p, int fd_index)
port_set_announce_tmo(p);
}
 
+   /*
+* As one of the port is in SLAVE state stop retriggering BMCA
+*/
+   if (p->jbod && !clock_get_client_state(p->clock)) {
+   port_clr_tmo(p->fda.fd[FD_ANNOUNCE_TIMER]);
+   }
+
delay_req_prune(p);
if (clock_slave_only(p->clock) && p->delayMechanism != DM_P2P &&
port_renew_transport(p)) {
--
1.8.3.1


