This is a note to let you know that I've just added the patch titled
af_unix: fix EPOLLET regression for stream sockets
to the 3.2-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
af_unix-fix-epollet-regression-for-stream-sockets.patch
and it can be found in the queue-3.2 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <[email protected]> know about it.
>From 1bb0e966ae02704e5a2f5915bde190f2ce2b32fc Mon Sep 17 00:00:00 2001
From: Eric Dumazet <[email protected]>
Date: Sat, 28 Jan 2012 16:11:03 +0000
Subject: af_unix: fix EPOLLET regression for stream sockets
From: Eric Dumazet <[email protected]>
[ Upstream commit 6f01fd6e6f6809061b56e78f1e8d143099716d70 ]
Commit 0884d7aa24 (AF_UNIX: Fix poll blocking problem when reading from
a stream socket) added a regression for epoll() in Edge Triggered mode
(EPOLLET)
Appropriate fix is to use skb_peek()/skb_unlink() instead of
skb_dequeue(), and only call skb_unlink() when skb is fully consumed.
This remove the need to requeue a partial skb into sk_receive_queue head
and the extra sk->sk_data_ready() calls that added the regression.
This is safe because once skb is given to sk_receive_queue, it is not
modified by a writer, and readers are serialized by u->readlock mutex.
This also reduce number of spinlock acquisition for small reads or
MSG_PEEK users so should improve overall performance.
Reported-by: Nick Mathewson <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Alexey Moiseytsev <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/unix/af_unix.c | 19 ++++---------------
1 file changed, 4 insertions(+), 15 deletions(-)
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1915,7 +1915,7 @@ static int unix_stream_recvmsg(struct ki
struct sk_buff *skb;
unix_state_lock(sk);
- skb = skb_dequeue(&sk->sk_receive_queue);
+ skb = skb_peek(&sk->sk_receive_queue);
if (skb == NULL) {
unix_sk(sk)->recursion_level = 0;
if (copied >= target)
@@ -1955,11 +1955,8 @@ static int unix_stream_recvmsg(struct ki
if (check_creds) {
/* Never glue messages from different writers */
if ((UNIXCB(skb).pid != siocb->scm->pid) ||
- (UNIXCB(skb).cred != siocb->scm->cred)) {
- skb_queue_head(&sk->sk_receive_queue, skb);
- sk->sk_data_ready(sk, skb->len);
+ (UNIXCB(skb).cred != siocb->scm->cred))
break;
- }
} else {
/* Copy credentials */
scm_set_cred(siocb->scm, UNIXCB(skb).pid,
UNIXCB(skb).cred);
@@ -1974,8 +1971,6 @@ static int unix_stream_recvmsg(struct ki
chunk = min_t(unsigned int, skb->len, size);
if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
- skb_queue_head(&sk->sk_receive_queue, skb);
- sk->sk_data_ready(sk, skb->len);
if (copied == 0)
copied = -EFAULT;
break;
@@ -1990,13 +1985,10 @@ static int unix_stream_recvmsg(struct ki
if (UNIXCB(skb).fp)
unix_detach_fds(siocb->scm, skb);
- /* put the skb back if we didn't use it up.. */
- if (skb->len) {
- skb_queue_head(&sk->sk_receive_queue, skb);
- sk->sk_data_ready(sk, skb->len);
+ if (skb->len)
break;
- }
+ skb_unlink(skb, &sk->sk_receive_queue);
consume_skb(skb);
if (siocb->scm->fp)
@@ -2007,9 +1999,6 @@ static int unix_stream_recvmsg(struct ki
if (UNIXCB(skb).fp)
siocb->scm->fp = scm_fp_dup(UNIXCB(skb).fp);
- /* put message back and return */
- skb_queue_head(&sk->sk_receive_queue, skb);
- sk->sk_data_ready(sk, skb->len);
break;
}
} while (size);
Patches currently in stable-queue which might be from [email protected] are
queue-3.2/net-reintroduce-missing-rcu_assign_pointer-calls.patch
queue-3.2/net-bpf_jit-fix-divide-by-0-generation.patch
queue-3.2/af_unix-fix-epollet-regression-for-stream-sockets.patch
queue-3.2/macvlan-fix-a-possible-use-after-free.patch
queue-3.2/netns-fix-net_alloc_generic.patch
queue-3.2/l2tp-l2tp_ip-fix-possible-oops-on-packet-receive.patch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html