On 28/06/2021 15:16, Xin Long wrote:
On Mon, Jun 28, 2021 at 3:03 PM Xin Long <lucien....@gmail.com> wrote:
On Sun, Jun 27, 2021 at 3:44 PM Erin Shepherd <erin.sheph...@e43.eu> wrote:
Xin Long <lucien....@gmail.com> writes:
Currently, when userspace reads a datagram with a buffer that is
smaller than this datagram, the data will be truncated and only
part of it can be received by users. It doesn't seem right that
users don't know the datagram size and have to use a huge buffer
to read it to avoid the truncation.
This patch to fix it by keeping the skb in rcv queue until the
whole data is read by users. Only the last msg of the datagram
will be marked with MSG_EOR, just as TCP/SCTP does.
Makes sense to me.
I agree that the current behavior is suboptimal, but:
* Isn't this the same behavior that other datagram socket types
exhibit? It seems like this would make TIPC behave inconsistently
compared to other transports
Yes, SCTP.
Do you see any reliable datagram transports not doing this?
* Isn't this a compatibility break with existing software? Particularly
existing software will not expect to receive trailers of overlong
datagrams
I talked to Jon about this, he seems okay with this.
It feels like this behavior should be activated either with a
setsockopt(2) call or a new MSG_* flag passed to recv
Anyway, It may not be worth a new sockopt.
I'm thinking to pass MSG_EOR into sendmsg:
sendmsg(MSG_EOR).
sorry, I meant recvmsg();
Still not sure I understand what you are suggesting here. Do you mean
that if we add MSG_EOR as a flagĀ to recvmsg() that means we *don't*
want the remainder of the message, i.e., it is ok to truncate it?
Or do you mean the opposite?
In the first case, we don't solve any compatibility issue, if that is
the purpose. The programmer still has to add code to get the current
behavior.
In the latter case we would be on the 100% safe side, although I have a
real hard time to see that this could be a real issue. Why would anybody
deliberately design an application for having messages truncated.
///jon
to indicate we don't want the truncating msg.
When the msg flag returns with no MSG_EOR, it means there's more data to read.
Thanks.
- Erin
Signed-off-by: Xin Long <lucien....@gmail.com>
---
net/tipc/socket.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 34a97ea36cc8..504e59838b8b 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1880,6 +1880,7 @@ static int tipc_recvmsg(struct socket *sock, struct
msghdr *m,
bool connected = !tipc_sk_type_connectionless(sk);
struct tipc_sock *tsk = tipc_sk(sk);
int rc, err, hlen, dlen, copy;
+ struct tipc_skb_cb *skb_cb;
struct sk_buff_head xmitq;
struct tipc_msg *hdr;
struct sk_buff *skb;
@@ -1903,6 +1904,7 @@ static int tipc_recvmsg(struct socket *sock, struct
msghdr *m,
if (unlikely(rc))
goto exit;
skb = skb_peek(&sk->sk_receive_queue);
+ skb_cb = TIPC_SKB_CB(skb);
hdr = buf_msg(skb);
dlen = msg_data_sz(hdr);
hlen = msg_hdr_sz(hdr);
@@ -1922,18 +1924,27 @@ static int tipc_recvmsg(struct socket *sock, struct
msghdr *m,
/* Capture data if non-error msg, otherwise just set return value */
if (likely(!err)) {
- copy = min_t(int, dlen, buflen);
- if (unlikely(copy != dlen))
- m->msg_flags |= MSG_TRUNC;
- rc = skb_copy_datagram_msg(skb, hlen, m, copy);
+ int offset = skb_cb->bytes_read;
+
+ copy = min_t(int, dlen - offset, buflen);
+ rc = skb_copy_datagram_msg(skb, hlen + offset, m, copy);
+ if (unlikely(rc))
+ goto exit;
+ if (unlikely(offset + copy < dlen)) {
+ if (!(flags & MSG_PEEK))
+ skb_cb->bytes_read = offset + copy;
+ } else {
+ m->msg_flags |= MSG_EOR;
+ skb_cb->bytes_read = 0;
+ }
} else {
copy = 0;
rc = 0;
- if (err != TIPC_CONN_SHUTDOWN && connected && !m->msg_control)
+ if (err != TIPC_CONN_SHUTDOWN && connected && !m->msg_control) {
rc = -ECONNRESET;
+ goto exit;
+ }
}
- if (unlikely(rc))
- goto exit;
/* Mark message as group event if applicable */
if (unlikely(grp_evt)) {
@@ -1956,9 +1967,10 @@ static int tipc_recvmsg(struct socket *sock, struct
msghdr *m,
tipc_node_distr_xmit(sock_net(sk), &xmitq);
}
- tsk_advance_rx_queue(sk);
+ if (!skb_cb->bytes_read)
+ tsk_advance_rx_queue(sk);
- if (likely(!connected))
+ if (likely(!connected) || skb_cb->bytes_read)
goto exit;
/* Send connection flow control advertisement when applicable */
--
2.27.0
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion