On Mon, Jun 28, 2021 at 3:03 PM Xin Long <lucien....@gmail.com> wrote: > > On Sun, Jun 27, 2021 at 3:44 PM Erin Shepherd <erin.sheph...@e43.eu> wrote: > > > > Xin Long <lucien....@gmail.com> writes: > > > > > Currently, when userspace reads a datagram with a buffer that is > > > smaller than this datagram, the data will be truncated and only > > > part of it can be received by users. It doesn't seem right that > > > users don't know the datagram size and have to use a huge buffer > > > to read it to avoid the truncation. > > > > > > This patch to fix it by keeping the skb in rcv queue until the > > > whole data is read by users. Only the last msg of the datagram > > > will be marked with MSG_EOR, just as TCP/SCTP does. > > > > I agree that the current behavior is suboptimal, but: > > > > * Isn't this the same behavior that other datagram socket types > > exhibit? It seems like this would make TIPC behave inconsistently > > compared to other transports > Yes, SCTP. > Do you see any reliable datagram transports not doing this? > > > * Isn't this a compatibility break with existing software? Particularly > > existing software will not expect to receive trailers of overlong > > datagrams > I talked to Jon about this, he seems okay with this. > > > > > It feels like this behavior should be activated either with a > > setsockopt(2) call or a new MSG_* flag passed to recv > Anyway, It may not be worth a new sockopt. > I'm thinking to pass MSG_EOR into sendmsg: > sendmsg(MSG_EOR). sorry, I meant recvmsg(); > to indicate we don't want the truncating msg. > > When the msg flag returns with no MSG_EOR, it means there's more data to read. > > Thanks. > > > > - Erin > > > > > Signed-off-by: Xin Long <lucien....@gmail.com> > > > --- > > > net/tipc/socket.c | 30 +++++++++++++++++++++--------- > > > 1 file changed, 21 insertions(+), 9 deletions(-) > > > > > > diff --git a/net/tipc/socket.c b/net/tipc/socket.c > > > index 34a97ea36cc8..504e59838b8b 100644 > > > --- a/net/tipc/socket.c > > > +++ b/net/tipc/socket.c > > > @@ -1880,6 +1880,7 @@ static int tipc_recvmsg(struct socket *sock, struct > > > msghdr *m, > > > bool connected = !tipc_sk_type_connectionless(sk); > > > struct tipc_sock *tsk = tipc_sk(sk); > > > int rc, err, hlen, dlen, copy; > > > + struct tipc_skb_cb *skb_cb; > > > struct sk_buff_head xmitq; > > > struct tipc_msg *hdr; > > > struct sk_buff *skb; > > > @@ -1903,6 +1904,7 @@ static int tipc_recvmsg(struct socket *sock, struct > > > msghdr *m, > > > if (unlikely(rc)) > > > goto exit; > > > skb = skb_peek(&sk->sk_receive_queue); > > > + skb_cb = TIPC_SKB_CB(skb); > > > hdr = buf_msg(skb); > > > dlen = msg_data_sz(hdr); > > > hlen = msg_hdr_sz(hdr); > > > @@ -1922,18 +1924,27 @@ static int tipc_recvmsg(struct socket *sock, > > > struct msghdr *m, > > > > > > /* Capture data if non-error msg, otherwise just set return value */ > > > if (likely(!err)) { > > > - copy = min_t(int, dlen, buflen); > > > - if (unlikely(copy != dlen)) > > > - m->msg_flags |= MSG_TRUNC; > > > - rc = skb_copy_datagram_msg(skb, hlen, m, copy); > > > + int offset = skb_cb->bytes_read; > > > + > > > + copy = min_t(int, dlen - offset, buflen); > > > + rc = skb_copy_datagram_msg(skb, hlen + offset, m, copy); > > > + if (unlikely(rc)) > > > + goto exit; > > > + if (unlikely(offset + copy < dlen)) { > > > + if (!(flags & MSG_PEEK)) > > > + skb_cb->bytes_read = offset + copy; > > > + } else { > > > + m->msg_flags |= MSG_EOR; > > > + skb_cb->bytes_read = 0; > > > + } > > > } else { > > > copy = 0; > > > rc = 0; > > > - if (err != TIPC_CONN_SHUTDOWN && connected && > > > !m->msg_control) > > > + if (err != TIPC_CONN_SHUTDOWN && connected && > > > !m->msg_control) { > > > rc = -ECONNRESET; > > > + goto exit; > > > + } > > > } > > > - if (unlikely(rc)) > > > - goto exit; > > > > > > /* Mark message as group event if applicable */ > > > if (unlikely(grp_evt)) { > > > @@ -1956,9 +1967,10 @@ static int tipc_recvmsg(struct socket *sock, > > > struct msghdr *m, > > > tipc_node_distr_xmit(sock_net(sk), &xmitq); > > > } > > > > > > - tsk_advance_rx_queue(sk); > > > + if (!skb_cb->bytes_read) > > > + tsk_advance_rx_queue(sk); > > > > > > - if (likely(!connected)) > > > + if (likely(!connected) || skb_cb->bytes_read) > > > goto exit; > > > > > > /* Send connection flow control advertisement when applicable */ > > > -- > > > 2.27.0 > > > > > > > > > > > > _______________________________________________ > > > tipc-discussion mailing list > > > tipc-discussion@lists.sourceforge.net > > > https://lists.sourceforge.net/lists/listinfo/tipc-discussion
_______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion