On Wed, May 26, 2021 at 05:45:29PM +0000, Amar Subramanyam wrote: > Hi Miroslav, > > We were able to reproduce the issue even without running phc2sys. > Please find the attached strace and ptp4l logs when this issue is seen.
Ok, thanks. That's very helpful. > From the ptp4l log, we can see that BMCA took 148 msec to run(Mono Interval > :: 148445967). > The same can be observed from strace logs. In the attached strace log, BMCA > is executed between the timestamps 00:07:27.047830 (line number 53) to > 00:07:27.196275(line number 102). I don't see that in the log. > 00:07:27.142942 close(4) = 0 > 00:07:27.153733 close(15) = 0 > 00:07:27.167783 socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) = 4 > 00:07:27.167829 ioctl(4, SIOCGIFHWADDR, {ifr_name="sriov0", > ifr_hwaddr={sa_family=ARPHRD_ETHER, sa_data=64:4c:36:12:55:e0}}) = 0 > 00:07:27.167878 close(4) = 0 > 00:07:27.167964 socket(AF_PACKET, SOCK_RAW, 768) = 4 > 00:07:27.168023 ioctl(4, SIOCGIFINDEX, {ifr_name="sriov0", }) = 0 > 00:07:27.168095 bind(4, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), > sll_ifindex=if_nametoindex("sriov0"), sll_hatype=ARPHRD_NETROM, > sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0 > 00:07:27.178781 setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "sriov0", 6) = 0 > 00:07:27.178830 setsockopt(4, SOL_SOCKET, SO_ATTACH_FILTER, {len=12, > filter=0x635ae0}, 16) = 0 > 00:07:27.178875 setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, > {mr_ifindex=if_nametoindex("sriov0"), mr_type=PACKET_MR_MULTICAST, mr_alen=6, > mr_address=011b19000000}, 16) = 0 > 00:07:27.179151 setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, > {mr_ifindex=if_nametoindex("sriov0"), mr_type=PACKET_MR_MULTICAST, mr_alen=6, > mr_address=0180c200000e}, 16) = 0 > 00:07:27.179415 socket(AF_PACKET, SOCK_RAW, 768) = 15 > 00:07:27.179482 ioctl(15, SIOCGIFINDEX, {ifr_name="sriov0", }) = 0 > 00:07:27.179518 bind(15, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), > sll_ifindex=if_nametoindex("sriov0"), sll_hatype=ARPHRD_NETROM, > sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0 > 00:07:27.193807 setsockopt(15, SOL_SOCKET, SO_BINDTODEVICE, "sriov0", 6) = 0 What I see is that it's the closing and binding of the raw sockets that is slowing down ptp4l so much that a received message waits for up to ~40 milliseconds before the clock check gets its timestamp. ptp4l is renewing the transport on announce/sync timeout, which according to the git log is needed in the client-only mode to not get multicast sockets stuck when the link goes down. I think the fix here is to avoid renewing the raw transport. It shouldn't be necessary if I understand it correctly. Can you please verify that the following change fixes the problem for you? --- a/port.c +++ b/port.c @@ -1806,6 +1806,12 @@ static int port_renew_transport(struct port *p) if (!port_is_enabled(p)) { return 0; } + + /* Closing and binding of raw sockets is too slow and unnecessary */ + if (transport_type(p->trp) == TRANS_IEEE_802_3) { + return 0; + } + transport_close(p->trp, &p->fda); port_clear_fda(p, FD_FIRST_TIMER); res = transport_open(p->trp, p->iface, &p->fda, p->timestamping); -- Miroslav Lichvar _______________________________________________ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel