> -----Original Message----- > From: Vladimir Oltean <olte...@gmail.com> > Sent: Monday, December 16, 2019 3:10 PM > To: richardcoch...@gmail.com > Cc: linuxptp-devel@lists.sourceforge.net > Subject: [Linuxptp-devel] [PATCH 1/3] ptp4l: Call recvmsg() with the > MSG_DONTWAIT flag > > The application's main event loop (clock_poll) is woken up by poll() and > dispatches the socket receive queue events to the corresponding ports as > needed. > > So it is a bug if poll() wakes up the process for data availability on a > socket's receive queue, and then recvmsg(), called immediately > afterwards, goes to sleep trying to retrieve it. This patch will > generate an error that will be propagated to the user if this condition > happens. > > Can it happen? > > As of this patch, ptp4l uses the SO_SELECT_ERR_QUEUE socket option, > which means that poll() will wake the process up, with revents == > (POLLIN | POLLERR), if data is available in the error queue. But > clock_poll() does not check POLLERR, just POLLIN, and draws the wrong > conclusion that there is data available in the receive queue (when it is > in fact available in the error queue). > > When the above condition happens, recvmsg() will sleep typically for a > whole sync interval waiting for data on the event socket, and will be > woken up when the new real frame arrives. It will not dequeue follow-up > messages during this time (which are sent to the general message socket) > and when it does, it will already be late for them (their seqid will be > out of order). So it will drop them and everything that comes after. The > synchronization process will fail. > > The above condition shouldn't typically happen, but exceptional kernel > events will trigger it. It helps to be strict in ptp4l in order for > those events to not blow up in even stranger symptoms unrelated to the > root cause of the problem. >
Yep, makes sense. > Signed-off-by: Vladimir Oltean <olte...@gmail.com> > --- > raw.c | 2 +- > udp.c | 2 +- > udp6.c | 2 +- > 3 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/raw.c b/raw.c > index f1c92b9f8d90..c0a8cd80855f 100644 > --- a/raw.c > +++ b/raw.c > @@ -278,7 +278,7 @@ static int raw_recv(struct transport *t, int fd, void > *buf, int > buflen, > buflen += hlen; > hdr = (struct eth_hdr *) ptr; > > - cnt = sk_receive(fd, ptr, buflen, addr, hwts, 0); > + cnt = sk_receive(fd, ptr, buflen, addr, hwts, MSG_DONTWAIT); > So basically we just indicate that we do not want to sleep. Good. This makes sense to me. Thanks, Jake > if (cnt >= 0) > cnt -= hlen; > diff --git a/udp.c b/udp.c > index 48af482b4526..eb1617872f37 100644 > --- a/udp.c > +++ b/udp.c > @@ -210,7 +210,7 @@ no_event: > static int udp_recv(struct transport *t, int fd, void *buf, int buflen, > struct address *addr, struct hw_timestamp *hwts) > { > - return sk_receive(fd, buf, buflen, addr, hwts, 0); > + return sk_receive(fd, buf, buflen, addr, hwts, MSG_DONTWAIT); > } > > static int udp_send(struct transport *t, struct fdarray *fda, > diff --git a/udp6.c b/udp6.c > index 74ebc7f0cf09..06c6fad2160f 100644 > --- a/udp6.c > +++ b/udp6.c > @@ -227,7 +227,7 @@ no_event: > static int udp6_recv(struct transport *t, int fd, void *buf, int buflen, > struct address *addr, struct hw_timestamp *hwts) > { > - return sk_receive(fd, buf, buflen, addr, hwts, 0); > + return sk_receive(fd, buf, buflen, addr, hwts, MSG_DONTWAIT); > } > > static int udp6_send(struct transport *t, struct fdarray *fda, > -- > 2.17.1 > > > > _______________________________________________ > Linuxptp-devel mailing list > Linuxptp-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/linuxptp-devel _______________________________________________ Linuxptp-devel mailing list Linuxptp-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-devel