Hi Miroslav, Please find our response inline.
Thanks, Amar B S > -----Original Message----- > From: Miroslav Lichvar [mailto:mlich...@redhat.com] > Sent: 27 May 2021 15:14 > To: Amar Subramanyam <asubraman...@altiostar.com> > Cc: linuxptp-devel@lists.sourceforge.net > Subject: Re: [PATCH v2 3/3] port: Don't check timestamps from non-slave ports. > > CAUTION: This email originated from outside of Altiostar. Do not click on > links > or open attachments unless you recognize the sender and you are sure the > content is safe. You will never be asked to reset your Altiostar password via > email. > > > On Wed, May 26, 2021 at 05:45:29PM +0000, Amar Subramanyam wrote: > > Hi Miroslav, > > > > We were able to reproduce the issue even without running phc2sys. > > Please find the attached strace and ptp4l logs when this issue is seen. > > Ok, thanks. That's very helpful. > > > From the ptp4l log, we can see that BMCA took 148 msec to run(Mono Interval > :: 148445967). > > The same can be observed from strace logs. In the attached strace log, BMCA > is executed between the timestamps 00:07:27.047830 (line number 53) to > 00:07:27.196275(line number 102). > > I don't see that in the log. > > > 00:07:27.142942 close(4) = 0 > > 00:07:27.153733 close(15) = 0 > > 00:07:27.167783 socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) = 4 > > 00:07:27.167829 ioctl(4, SIOCGIFHWADDR, {ifr_name="sriov0", > ifr_hwaddr={sa_family=ARPHRD_ETHER, sa_data=64:4c:36:12:55:e0}}) = 0 > > 00:07:27.167878 close(4) = 0 > > 00:07:27.167964 socket(AF_PACKET, SOCK_RAW, 768) = 4 > > 00:07:27.168023 ioctl(4, SIOCGIFINDEX, {ifr_name="sriov0", }) = 0 > > 00:07:27.168095 bind(4, {sa_family=AF_PACKET, > > sll_protocol=htons(ETH_P_ALL), sll_ifindex=if_nametoindex("sriov0"), > > sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = > > 0 > > 00:07:27.178781 setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "sriov0", > > 6) = 0 > > 00:07:27.178830 setsockopt(4, SOL_SOCKET, SO_ATTACH_FILTER, {len=12, > > filter=0x635ae0}, 16) = 0 > > 00:07:27.178875 setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, > > {mr_ifindex=if_nametoindex("sriov0"), mr_type=PACKET_MR_MULTICAST, > > mr_alen=6, mr_address=011b19000000}, 16) = 0 > > 00:07:27.179151 setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, > > {mr_ifindex=if_nametoindex("sriov0"), mr_type=PACKET_MR_MULTICAST, > > mr_alen=6, mr_address=0180c200000e}, 16) = 0 > > 00:07:27.179415 socket(AF_PACKET, SOCK_RAW, 768) = 15 > > 00:07:27.179482 ioctl(15, SIOCGIFINDEX, {ifr_name="sriov0", }) = 0 > > 00:07:27.179518 bind(15, {sa_family=AF_PACKET, > > sll_protocol=htons(ETH_P_ALL), sll_ifindex=if_nametoindex("sriov0"), > > sll_hatype=ARPHRD_NETROM, sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = > > 0 > > 00:07:27.193807 setsockopt(15, SOL_SOCKET, SO_BINDTODEVICE, "sriov0", > > 6) = 0 > > What I see is that it's the closing and binding of the raw sockets that is > slowing > down ptp4l so much that a received message waits for up to ~40 milliseconds > before the clock check gets its timestamp. > > ptp4l is renewing the transport on announce/sync timeout, which according to > the git log is needed in the client-only mode to not get multicast sockets > stuck > when the link goes down. I think the fix here is to avoid renewing the raw > transport. It shouldn't be necessary if I understand it correctly. > > Can you please verify that the following change fixes the problem for you? > > --- a/port.c > +++ b/port.c > @@ -1806,6 +1806,12 @@ static int port_renew_transport(struct port *p) > if (!port_is_enabled(p)) { > return 0; > } > + > + /* Closing and binding of raw sockets is too slow and unnecessary */ > + if (transport_type(p->trp) == TRANS_IEEE_802_3) { > + return 0; > + } > + > transport_close(p->trp, &p->fda); > port_clear_fda(p, FD_FIRST_TIMER); > res = transport_open(p->trp, p->iface, &p->fda, p->timestamping); > Yes, With this patch applied, the problem seems to be fixed. We even did a cable removal test and link status was correctly detected. We could see that ptp4l already has RTNL functionality and hence this port_renew_transport() seems to be unnecessary redundancy. Should bypassing the function entirely not just for 802_3 but also for UDPv4 and UDPv6 transports be the right solution? Let us know your comments. > -- > Miroslav Lichvar _______________________________________________ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel