Hey Adam,

That's interesting. If the patch turned the "FAULTY" transition into "bad
message" logs, then it looks related. The change should only affect packets
where recvmsg successfully returns zero, as you see.
After entering UNCALIBRATED, did it eventually reach SLAVE? (The bad messages
logged each second are annoying as log noise, but at least they no longer
reset ptp4l's state machine. You'd see the same if something was spamming
1-byte payloads, which still aren't a valid PTP header.)
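
To illustrate what "recvmsg successfully returns zero" means, here is a
minimal Python sketch over loopback. It is not ptp4l code: the function name
is made up for illustration and it uses an ephemeral port rather than the
real PTP port. It only shows that an empty UDP datagram is delivered as a
successful zero-length read, not as an error.

```python
import socket


def recv_empty_datagram():
    """Send a zero-length UDP datagram over loopback and read it with recvmsg.

    Returns the number of payload bytes received.
    """
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Ephemeral loopback port (assumption for the demo, not port 320).
        rx.bind(("127.0.0.1", 0))
        rx.settimeout(2.0)
        # An empty payload is a legal UDP datagram.
        tx.sendto(b"", rx.getsockname())
        # recvmsg succeeds; the data part is simply empty.
        data, _ancdata, _flags, _addr = rx.recvmsg(2048)
        return len(data)
    finally:
        tx.close()
        rx.close()


if __name__ == "__main__":
    print(recv_empty_datagram())
```

A receiver that treats a zero-byte read as a failure would fault on such a
packet, whereas one that treats it as a malformed message can just log and
carry on.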

Did you run tcpdump while ptp4l was running? It's possible the master doesn't
send (malformed) status requests to non-active clients, or it could be
something to do with multicast subscriptions. I'd expect the undersized
packet to come amid a flurry of other management GET messages.

Alternatively, one can drop such undersized packets before they get anywhere
near ptp4l. I'm not especially knowledgeable about firewalls, but this seems
to do the trick too, either preventing faults pre-patch or stopping noisy
'bad message' logs post-patch:

iptables -A INPUT -p udp -m udp --dport 320 -m length --length 28 -j DROP
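
For what it's worth, here is where the 28 comes from, as a small Python
sketch. It assumes the length match is compared against the IPv4 total
length and that the packet carries no IP options. The constant and function
names are mine, for illustration only.

```python
IPV4_HEADER_LEN = 20  # bytes, assuming no IP options
UDP_HEADER_LEN = 8    # bytes
PTP_HEADER_LEN = 34   # bytes, common PTPv2 message header


def ipv4_total_length(udp_payload_len):
    """Total IPv4 packet length for a UDP datagram with the given payload size."""
    return IPV4_HEADER_LEN + UDP_HEADER_LEN + udp_payload_len


def could_hold_ptp_header(total_length):
    """True if the packet is at least long enough to carry a full PTP header."""
    return total_length >= ipv4_total_length(PTP_HEADER_LEN)


if __name__ == "__main__":
    # A zero-payload datagram is headers only.
    print(ipv4_total_length(0))
    print(could_hold_ptp_header(ipv4_total_length(0)))
```

So the rule above only catches the empty-payload case. Other undersized
packets (say a 1-byte payload) would have a different length and would still
reach ptp4l.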

I have seen this in the wild a couple of times with PTP master appliance
devices. It's not clear why the tcpdump test was negative, but the appearance
of bad messages still sounds promising.

Cheers,
David

PS: It may also be worth looking for a master firmware update. I've heard
rumours that one may have been in the pipeline earlier this year, but I have
no confirmation that it was released or that it addressed this specific
issue. We carry the linked patch to ptp4l just to be sure.

On Thu, 5 Sep 2019 at 01:26, Essling, Adam M <adam.essl...@udri.udayton.edu>
wrote:

> Thank you both for responding so quickly!
>
> David,
> I tried the patch you recommended but ended up getting this result:
> ptp4l[4756130.278]: port 1: INITIALIZING to LISTENING on INITIALIZE
> ptp4l[4756130.278]: port 0: INITIALIZING to LISTENING on INITIALIZE
> ptp4l[4756130.278]: port 1: link up
> ptp4l[4756130.952]: port 1: bad message
> ptp4l[4756131.951]: port 1: new foreign master 000cec.fffe.0d013a-1
> ptp4l[4756131.952]: port 1: bad message
> ptp4l[4756132.952]: port 1: bad message
> ptp4l[4756133.952]: port 1: bad message
> ptp4l[4756134.952]: port 1: bad message
> ptp4l[4756135.951]: selected best master clock 000cec.fffe.0d013a
> ptp4l[4756135.951]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> ptp4l[4756135.952]: port 1: bad message
>
> Followed by continuous bad messages after that. I checked the tcpdump and
> none of the UDP messages I saw had a zero payload, so I'm not sure what the
> bad messages are.
>
> I also tried running ptp4l v2.0 and I got a slightly different error
> message:
> ptp4l: [8549.759] selected /dev/ptp0 as PTP clock
> ptp4l: [8549.761] port 1: INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l: [8549.761] port 0: INITIALIZING to LISTENING on INIT_COMPLETE
> ptp4l: [8550.221] port 1: new foreign master 000cec.fffe.0d013a-1
> ptp4l: [8550.221] recvmsg failed: No such device or address
> ptp4l: [8550.222] port 1: recv message failed
> ptp4l: [8550.222] port 1: LISTENING to FAULTY on FAULT_DETECTED
> (FT_UNSPECIFIED)
> ...
>
> I'm not sure if it's relevant but I have been able to use ptpd2 with both
> slave systems and the Versasync clock with the same network setup.
>
> Richard,
> Here's the info you requested:
>
> [system1] uname -r
> 4.4.38-tegra
>
> [system1] ethtool -i
> driver: eqos
> version:
> firmware-version:
> expansion-rom-version:
> bus-info: 2490000.ether_qos
> supports-statistics: yes
> supports-test: no
> supports-eeprom-access: no
> supports-register-dump: no
> supports-priv-flags: no
>
> [system1] iptables -L
> Chain INPUT (policy ACCEPT)
> target     prot opt source               destination
>
> Chain FORWARD (policy DROP)
> target     prot opt source               destination
> DOCKER-USER  all  --  anywhere             anywhere
> DOCKER-ISOLATION-STAGE-1  all  --  anywhere             anywhere
>
> ACCEPT     all  --  anywhere             anywhere             ctstate
> RELATED,ESTABLISHED
> DOCKER     all  --  anywhere             anywhere
> ACCEPT     all  --  anywhere             anywhere
> ACCEPT     all  --  anywhere             anywhere
>
> Chain OUTPUT (policy ACCEPT)
> target     prot opt source               destination
>
> Chain DOCKER (1 references)
> target     prot opt source               destination
>
> Chain DOCKER-ISOLATION-STAGE-1 (1 references)
> target     prot opt source               destination
> DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere
>
> RETURN     all  --  anywhere             anywhere
>
> Chain DOCKER-ISOLATION-STAGE-2 (1 references)
> target     prot opt source               destination
> DROP       all  --  anywhere             anywhere
> RETURN     all  --  anywhere             anywhere
>
> Chain DOCKER-USER (1 references)
> target     prot opt source               destination
> RETURN     all  --  anywhere             anywhere
>
>
> [system2] uname -r
> 4.15.0-47-generic
>
> [system2] ethtool -i
> driver: igb
> version: 5.4.0-k
> firmware-version: 3.25, 0x800005d0
> expansion-rom-version:
> bus-info: 0000:08:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
> [system2] iptables -L
> Chain INPUT (policy ACCEPT)
> target     prot opt source               destination
>
> Chain FORWARD (policy DROP)
> target     prot opt source               destination
> DOCKER-USER  all  --  anywhere             anywhere
> DOCKER-ISOLATION-STAGE-1  all  --  anywhere             anywhere
>
> ACCEPT     all  --  anywhere             anywhere             ctstate
> RELATED,ESTABLISHED
> DOCKER     all  --  anywhere             anywhere
> ACCEPT     all  --  anywhere             anywhere
> ACCEPT     all  --  anywhere             anywhere
> ACCEPT     all  --  anywhere             anywhere             ctstate
> RELATED,ESTABLISHED
> DOCKER     all  --  anywhere             anywhere
> ACCEPT     all  --  anywhere             anywhere
> ACCEPT     all  --  anywhere             anywhere
>
> Chain OUTPUT (policy ACCEPT)
> target     prot opt source               destination
>
> Chain DOCKER (2 references)
> target     prot opt source               destination
> ACCEPT     tcp  --  anywhere             172.17.0.2           tcp dpt:5000
> ACCEPT     tcp  --  anywhere             172.18.0.2           tcp dpt:9000
>
> Chain DOCKER-ISOLATION-STAGE-1 (1 references)
> target     prot opt source               destination
> DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere
>
> DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere
>
> RETURN     all  --  anywhere             anywhere
>
> Chain DOCKER-ISOLATION-STAGE-2 (2 references)
> target     prot opt source               destination
> DROP       all  --  anywhere             anywhere
> DROP       all  --  anywhere             anywhere
> RETURN     all  --  anywhere             anywhere
>
> Chain DOCKER-USER (1 references)
> target     prot opt source               destination
> RETURN     all  --  anywhere             anywhere
>
> Thanks,
> Adam
> ________________________________________
> From: Richard Cochran [richardcoch...@gmail.com]
> Sent: Tuesday, September 03, 2019 11:08 PM
> To: Essling, Adam M
> Cc: linuxptp-users@lists.sourceforge.net
> Subject: Re: [Linuxptp-users] recvmsg failed error on multiple slave
> systems
>
> On Tue, Sep 03, 2019 at 09:00:08PM +0000, Essling, Adam M wrote:
> > Hi, I'm trying to use ptp4l with [system1] and [system2] as slaves with
> > a Spectracom Versasync PTP clock as master. Both slave systems are running
> > Ubuntu 16.04 with ptp4l v1.8. I am using the default ptp4l.conf file. When
> > I run the following command (on system1):
> >
> > sudo ptp4l -i eth0 -f /etc/linuxptp/ptp4l.conf -m
> >
> > I get the following output:
> > ptp4l: [7769.535] selected /dev/ptp0 as PTP clock
> > ptp4l: [7769.537] port 1: INITIALIZING to LISTENING on INITIALIZE
> > ptp4l: [7769.537] port 0: INITIALIZING to LISTENING on INITIALIZE
> > ptp4l: [7769.537] port 1: link up
> > ptp4l: [7770.221] port 1: new foreign master 000cec.fffe.xxxxxx-1
>
> So we did receive an Announce message (I guess on the general port).
>
> > ptp4l: [7770.221] recvmsg failed: No such file or directory
>
> But here recvmsg() returns ENOENT.  Strange.
>
> That error isn't listed on the man page.  I briefly scanned the kernel
> stack, and there are indeed a few cases where ENOENT can be returned,
> but I didn't see anything that could apply in this case.
>
> Could this possibly be due to a firewall?
>
> > I know the recvmsg failed error has something to do with the ptp4l
> > socket but I'm not sure how to go about fixing it. Please let me
> > know if more information is needed.
>
> - uname -r
> - ethtool -i
> - iptables -L
>
> Thanks,
> Richard
>
_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users
