> -----Original Message-----
> From: Vladimir Oltean <olte...@gmail.com>
> Sent: Monday, December 16, 2019 3:10 PM
> To: richardcoch...@gmail.com
> Cc: linuxptp-devel@lists.sourceforge.net
> Subject: [Linuxptp-devel] [PATCH 1/3] ptp4l: Call recvmsg() with the
> MSG_DONTWAIT flag
> 
> The application's main event loop (clock_poll) is woken up by poll() and
> dispatches the socket receive queue events to the corresponding ports as
> needed.
> 
> So it is a bug if poll() wakes up the process for data availability on a
> socket's receive queue, and then recvmsg(), called immediately
> afterwards, goes to sleep trying to retrieve it. This patch will
> generate an error that will be propagated to the user if this condition
> happens.
> 
> Can it happen?
> 
> As of this patch, ptp4l uses the SO_SELECT_ERR_QUEUE socket option,
> which means that poll() will wake the process up, with revents ==
> (POLLIN | POLLERR), if data is available in the error queue. But
> clock_poll() does not check POLLERR, just POLLIN, and draws the wrong
> conclusion that there is data available in the receive queue (when it is
> in fact available in the error queue).
> 
> When the above condition happens, recvmsg() will sleep typically for a
> whole sync interval waiting for data on the event socket, and will be
> woken up when the new real frame arrives. It will not dequeue follow-up
> messages during this time (which are sent to the general message socket)
> and when it does, it will already be late for them (their seqid will be
> out of order). So it will drop them and everything that comes after. The
> synchronization process will fail.
> 
> The above condition shouldn't typically happen, but exceptional kernel
> events will trigger it. It helps to be strict in ptp4l in order for
> those events to not blow up in even stranger symptoms unrelated to the
> root cause of the problem.
> 

Yep, makes sense.

> Signed-off-by: Vladimir Oltean <olte...@gmail.com>
> ---
>  raw.c  | 2 +-
>  udp.c  | 2 +-
>  udp6.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/raw.c b/raw.c
> index f1c92b9f8d90..c0a8cd80855f 100644
> --- a/raw.c
> +++ b/raw.c
> @@ -278,7 +278,7 @@ static int raw_recv(struct transport *t, int fd, void 
> *buf, int
> buflen,
>       buflen += hlen;
>       hdr = (struct eth_hdr *) ptr;
> 
> -     cnt = sk_receive(fd, ptr, buflen, addr, hwts, 0);
> +     cnt = sk_receive(fd, ptr, buflen, addr, hwts, MSG_DONTWAIT);
> 

So basically we just indicate that we do not want to sleep. Good.

This makes sense to me.

Thanks,
Jake

>       if (cnt >= 0)
>               cnt -= hlen;
> diff --git a/udp.c b/udp.c
> index 48af482b4526..eb1617872f37 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -210,7 +210,7 @@ no_event:
>  static int udp_recv(struct transport *t, int fd, void *buf, int buflen,
>                   struct address *addr, struct hw_timestamp *hwts)
>  {
> -     return sk_receive(fd, buf, buflen, addr, hwts, 0);
> +     return sk_receive(fd, buf, buflen, addr, hwts, MSG_DONTWAIT);
>  }
> 
>  static int udp_send(struct transport *t, struct fdarray *fda,
> diff --git a/udp6.c b/udp6.c
> index 74ebc7f0cf09..06c6fad2160f 100644
> --- a/udp6.c
> +++ b/udp6.c
> @@ -227,7 +227,7 @@ no_event:
>  static int udp6_recv(struct transport *t, int fd, void *buf, int buflen,
>                    struct address *addr, struct hw_timestamp *hwts)
>  {
> -     return sk_receive(fd, buf, buflen, addr, hwts, 0);
> +     return sk_receive(fd, buf, buflen, addr, hwts, MSG_DONTWAIT);
>  }
> 
>  static int udp6_send(struct transport *t, struct fdarray *fda,
> --
> 2.17.1
> 
> 
> 
> _______________________________________________
> Linuxptp-devel mailing list
> Linuxptp-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linuxptp-devel


_______________________________________________
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel

Reply via email to