On Wed, May 26, 2021 at 05:45:29PM +0000, Amar Subramanyam wrote:
> Hi Miroslav,
> 
> We were able to reproduce the issue even without running phc2sys.
> Please find the attached strace and ptp4l logs when this issue is seen.

Ok, thanks. That's very helpful.

> From the ptp4l log, we can see that BMCA took 148 msec to run(Mono Interval 
> :: 148445967).
> The same can be observed from strace logs. In the attached strace log, BMCA 
> is executed between the timestamps 00:07:27.047830 (line number 53) to 
> 00:07:27.196275(line number 102).

I don't see that in the log.

> 00:07:27.142942 close(4)                = 0
> 00:07:27.153733 close(15)               = 0
> 00:07:27.167783 socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) = 4
> 00:07:27.167829 ioctl(4, SIOCGIFHWADDR, {ifr_name="sriov0", 
> ifr_hwaddr={sa_family=ARPHRD_ETHER, sa_data=64:4c:36:12:55:e0}}) = 0
> 00:07:27.167878 close(4)                = 0
> 00:07:27.167964 socket(AF_PACKET, SOCK_RAW, 768) = 4
> 00:07:27.168023 ioctl(4, SIOCGIFINDEX, {ifr_name="sriov0", }) = 0
> 00:07:27.168095 bind(4, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), 
> sll_ifindex=if_nametoindex("sriov0"), sll_hatype=ARPHRD_NETROM, 
> sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0
> 00:07:27.178781 setsockopt(4, SOL_SOCKET, SO_BINDTODEVICE, "sriov0", 6) = 0
> 00:07:27.178830 setsockopt(4, SOL_SOCKET, SO_ATTACH_FILTER, {len=12, 
> filter=0x635ae0}, 16) = 0
> 00:07:27.178875 setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, 
> {mr_ifindex=if_nametoindex("sriov0"), mr_type=PACKET_MR_MULTICAST, mr_alen=6, 
> mr_address=011b19000000}, 16) = 0
> 00:07:27.179151 setsockopt(4, SOL_PACKET, PACKET_ADD_MEMBERSHIP, 
> {mr_ifindex=if_nametoindex("sriov0"), mr_type=PACKET_MR_MULTICAST, mr_alen=6, 
> mr_address=0180c200000e}, 16) = 0
> 00:07:27.179415 socket(AF_PACKET, SOCK_RAW, 768) = 15
> 00:07:27.179482 ioctl(15, SIOCGIFINDEX, {ifr_name="sriov0", }) = 0
> 00:07:27.179518 bind(15, {sa_family=AF_PACKET, sll_protocol=htons(ETH_P_ALL), 
> sll_ifindex=if_nametoindex("sriov0"), sll_hatype=ARPHRD_NETROM, 
> sll_pkttype=PACKET_HOST, sll_halen=0}, 20) = 0
> 00:07:27.193807 setsockopt(15, SOL_SOCKET, SO_BINDTODEVICE, "sriov0", 6) = 0

What I see is that the closing and binding of the raw sockets is
slowing down ptp4l so much that a received message can wait ~40
milliseconds before the clock check gets its timestamp.
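
To put numbers on that, here is a rough sketch (not part of linuxptp)
that computes the gap between two timestamps in the strace excerpt.
The helper names are made up for illustration, and it assumes the
"HH:MM:SS.uuuuuu" format with exactly six fractional digits. Since the
timestamp marks the start of each call, the gap to the next line is
only an upper bound on the time spent in the call itself.

```c
#include <stdio.h>

/*
 * Convert a timestamp of the form "HH:MM:SS.uuuuuu" (as in the strace
 * excerpt above) to microseconds since midnight.
 * Assumes exactly six fractional digits. Returns -1 on a parse error.
 */
long long strace_ts_to_us(const char *ts)
{
	int h, m, s, us;

	if (sscanf(ts, "%2d:%2d:%2d.%6d", &h, &m, &s, &us) != 4)
		return -1;

	return ((h * 60LL + m) * 60 + s) * 1000000LL + us;
}

/*
 * Gap in microseconds between the start of one syscall and the start
 * of the next one logged. Returns -1 if either timestamp is malformed.
 * Does not handle a wrap over midnight.
 */
long long strace_gap_us(const char *start, const char *next)
{
	long long a = strace_ts_to_us(start);
	long long b = strace_ts_to_us(next);

	if (a < 0 || b < 0)
		return -1;

	return b - a;
}
```

For the excerpt above, strace_gap_us("00:07:27.168095",
"00:07:27.178781") evaluates to 10686, i.e. about 10.7 ms between the
start of bind() on fd 4 and the next call. The same calculation gives
about 10.8 ms and 14.1 ms for close(4) and close(15), and about 14.3 ms
for bind() on fd 15.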

ptp4l renews the transport on announce/sync timeout, which according
to the git log is needed in the client-only mode so that multicast
sockets don't get stuck when the link goes down. I think the fix here
is to avoid renewing the raw transport. If I understand it correctly,
the renewal shouldn't be necessary for raw sockets.

Can you please verify that the following change fixes the problem for
you?

--- a/port.c
+++ b/port.c
@@ -1806,6 +1806,12 @@ static int port_renew_transport(struct port *p)
        if (!port_is_enabled(p)) {
                return 0;
        }
+
+       /* Closing and binding of raw sockets is too slow and unnecessary */
+       if (transport_type(p->trp) == TRANS_IEEE_802_3) {
+               return 0;
+       }
+
        transport_close(p->trp, &p->fda);
        port_clear_fda(p, FD_FIRST_TIMER);
        res = transport_open(p->trp, p->iface, &p->fda, p->timestamping);

-- 
Miroslav Lichvar



_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel
