Re: [PATCH] fix race in AF_UNIX

2007-06-26 Thread Eric W. Biederman
Miklos Szeredi [EMAIL PROTECTED] writes: And I think incremental GC algorithms are much too complex for this task. What I've realized, is that in fact we don't require a generic garbage collection algorithm, just a much more specialized cycle collection algorithm, since refcounting in struct

Re: [PATCH] fix race in AF_UNIX

2007-06-23 Thread Miklos Szeredi
Eric, thanks for looking at this. There are races involving the garbage collector, that can throw away perfectly good packets with AF_UNIX sockets in them. The problems arise when a socket goes from installed to in-flight or vice versa during garbage collection. Since gc is done

Re: [PATCH] fix race in AF_UNIX

2007-06-23 Thread Eric W. Biederman
Miklos Szeredi [EMAIL PROTECTED] writes: Right. But the devil is in the details, and (as you correctly point out later) to implement this, the whole locking scheme needs to be overhauled. Problems: - Using the queue lock to make the dequeue and the fd detach atomic wrt the GC is

Re: [PATCH] fix race in AF_UNIX

2007-06-21 Thread Eric W. Biederman
Miklos Szeredi [EMAIL PROTECTED] writes: [CC'd Al Viro and Alan Cox, restored patch] There are races involving the garbage collector, that can throw away perfectly good packets with AF_UNIX sockets in them. The problems arise when a socket goes from installed to in-flight or vice

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
Ping Dave, Since there doesn't seem to be any new ideas forthcoming, can we please decide on either one of my two sumbitted patches? Thanks, Miklos [CC'd Al Viro and Alan Cox, restored patch] There are races involving the garbage collector, that can throw away perfectly good packets

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 09:49:32 +0200 Ping Dave, Since there doesn't seem to be any new ideas forthcoming, can we please decide on either one of my two sumbitted patches? You just need to be patient, we can't just put a bad patch in because a better

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 09:49:32 +0200 Ping Dave, Since there doesn't seem to be any new ideas forthcoming, can we please decide on either one of my two sumbitted patches? You just need to be patient, we can't just put a bad patch in because a

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 10:20:23 +0200 And is anyone working on a better patch? I have no idea. Those patches aren't bad in the correctness sense. So IMO any one of them is better, than having that bug in there. You're adding a very serious performance

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
And is anyone working on a better patch? I have no idea. Those patches aren't bad in the correctness sense. So IMO any one of them is better, than having that bug in there. You're adding a very serious performance regression, which is about as bad as the bug itself. No,

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 11:29:52 +0200 And is anyone working on a better patch? I have no idea. Those patches aren't bad in the correctness sense. So IMO any one of them is better, than having that bug in there. You're adding a very

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
And is anyone working on a better patch? I have no idea. Those patches aren't bad in the correctness sense. So IMO any one of them is better, than having that bug in there. You're adding a very serious performance regression, which is about as bad as the bug

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 11:44:07 +0200 Secondarily, this bug has been around for years and nobody noticed. The world will not explode if this bug takes a few more days or even a week to work out. Let's do it right instead of ramming arbitrary turds

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
Secondarily, this bug has been around for years and nobody noticed. The world will not explode if this bug takes a few more days or even a week to work out. Let's do it right instead of ramming arbitrary turds into the kernel. Fine, but just wishing a bug to get fixed won't

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 11:55:19 +0200 It's doing trylocks and releasing all those locks within the same spin locked region, how the f**k can that deadlock? The case I'm very concerned about is the interactions of your new code with the

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Thomas Graf
* Miklos Szeredi [EMAIL PROTECTED] 2007-06-18 11:44 Garbage collection only ever happens, if the app is sending AF_UNIX sockets over AF_UNIX sockets. Which is a rather rare case. And which is basically why this bug went unnoticed for so long. So my second patch only affects the performance

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
* Miklos Szeredi [EMAIL PROTECTED] 2007-06-18 11:44 Garbage collection only ever happens, if the app is sending AF_UNIX sockets over AF_UNIX sockets. Which is a rather rare case. And which is basically why this bug went unnoticed for so long. So my second patch only affects the

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Thomas Graf
* Thomas Graf [EMAIL PROTECTED] 2007-06-18 12:32 * Miklos Szeredi [EMAIL PROTECTED] 2007-06-18 11:44 Garbage collection only ever happens, if the app is sending AF_UNIX sockets over AF_UNIX sockets. Which is a rather rare case. And which is basically why this bug went unnoticed for so

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Thomas Graf
* Miklos Szeredi [EMAIL PROTECTED] 2007-06-18 12:39 You are wrong. Look in unix_release_sock(): if (atomic_read(unix_tot_inflight)) unix_gc(); /* Garbage collect fds */ unix_tot_inflight is the number of AF_UNIX sockets currently being transferred over

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
* Thomas Graf [EMAIL PROTECTED] 2007-06-18 12:32 * Miklos Szeredi [EMAIL PROTECTED] 2007-06-18 11:44 Garbage collection only ever happens, if the app is sending AF_UNIX sockets over AF_UNIX sockets. Which is a rather rare case. And which is basically why this bug went unnoticed for

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 12:47:17 +0200 but it's not as if it's really going to affect performance in real cases. Since these circumstances are creatable by any user, we have to consider the cases caused by malicious entities. - To unsubscribe from this

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
but it's not as if it's really going to affect performance in real cases. Since these circumstances are creatable by any user, we have to consider the cases caused by malicious entities. OK. But then the whole gc thing is already broken, since a user can DoS socket creation/destruction.

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 12:55:24 +0200 I'm all for fixing this gc mess that we have now. But please don't expect me to be the one who's doing it. Don't worry, I only expect you to make the situation worse :-) - To unsubscribe from this list: send the line

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
I'm all for fixing this gc mess that we have now. But please don't expect me to be the one who's doing it. Don't worry, I only expect you to make the situation worse :-) That's real nice. Looks like finding and fixing bugs in not appreciated in the networking subsystem :-/ Miklos - To

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread David Miller
From: David Miller [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 04:02:53 -0700 (PDT) From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 18 Jun 2007 12:55:24 +0200 I'm all for fixing this gc mess that we have now. But please don't expect me to be the one who's doing it. Don't worry, I only

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Alan Cox
No, correctness always trumps performance. Lost packets on an AF_UNIX socket are _unexceptable_, and this is definitely not a theoretical problem. If its so unacceptable why has nobody noticed until now - its a bug clearly, it needs fixing clearly, but unless you can produce some kind of

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Jan Engelhardt
On Jun 18 2007 12:47, Alan Cox wrote: Do you want me to send the patch to Andrew instead? His attitude towards bugfixes is rather better ;) And it'll get NAKked and binned. DaveM is (as happens sometimes ;)) right to insist on the code being clean and efficient. Or see RFC 1925 number 7a.

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
I'm all for fixing this gc mess that we have now. But please don't expect me to be the one who's doing it. Don't worry, I only expect you to make the situation worse :-) In any event, I'll try to find some time to look more at your patch. But just like you don't want to be

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Alan Cox
On Mon, 18 Jun 2007 12:43:40 +0200 Thomas Graf [EMAIL PROTECTED] wrote: * Miklos Szeredi [EMAIL PROTECTED] 2007-06-18 12:39 You are wrong. Look in unix_release_sock(): if (atomic_read(unix_tot_inflight)) unix_gc(); /* Garbage collect fds */

Re: [PATCH] fix race in AF_UNIX

2007-06-18 Thread Miklos Szeredi
No, correctness always trumps performance. Lost packets on an AF_UNIX socket are _unexceptable_, and this is definitely not a theoretical problem. If its so unacceptable why has nobody noticed until now - its a bug clearly, it needs fixing clearly, but unless you can produce some kind

Re: [PATCH] fix race in AF_UNIX

2007-06-11 Thread Miklos Szeredi
[CC'd Al Viro and Alan Cox, restored patch] There are races involving the garbage collector, that can throw away perfectly good packets with AF_UNIX sockets in them. The problems arise when a socket goes from installed to in-flight or vice versa during garbage collection. Since gc is

Re: [PATCH] fix race in AF_UNIX

2007-06-07 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Wed, 06 Jun 2007 10:08:29 +0200 There are races involving the garbage collector, that can throw away perfectly good packets with AF_UNIX sockets in them. The problems arise when a socket goes from installed to in-flight or vice versa during

Re: [PATCH] fix race in AF_UNIX

2007-06-06 Thread Miklos Szeredi
Holding a global mutex over recvmsg() calls under AF_UNIX is pretty much a non-starter, this will kill performance for multi-threaded apps. That's an rwsem held for read. It's held for write in unix_gc() only for a short duration, and unix_gc() should only rarely be called. So

Re: [PATCH] fix race in AF_UNIX

2007-06-06 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Wed, 06 Jun 2007 10:08:29 +0200 Holding a global mutex over recvmsg() calls under AF_UNIX is pretty much a non-starter, this will kill performance for multi-threaded apps. That's an rwsem held for read. It's held for write in

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 04 Jun 2007 11:45:32 +0200 A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This happens if the send() and the close() is

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread Miklos Szeredi
A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This happens if the send() and the close() is performed between skb_dequeue() and checking sk-sk_shutdown in

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Tue, 05 Jun 2007 09:42:41 +0200 I just want to win the who's laziest? league. It would take me about 5 minutes to get the netdev tree and test compile the change. Of which 5 seconds would be actually updating the patch. I was thought it was OK to

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Tue, 05 Jun 2007 10:11:56 +0200 I just want to win the who's laziest? league. It would take me about 5 minutes to get the netdev tree and test compile the change. Of which 5 seconds would be actually updating the patch. I was thought it

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread Miklos Szeredi
I just want to win the who's laziest? league. It would take me about 5 minutes to get the netdev tree and test compile the change. Of which 5 seconds would be actually updating the patch. I was thought it was OK to pass that 5 seconds worth of hard work to you in order to save the rest

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Tue, 05 Jun 2007 09:42:41 +0200 From: Miklos Szeredi [EMAIL PROTECTED] A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: David Miller [EMAIL PROTECTED] Date: Tue, 05 Jun 2007 00:02:47 -0700 (PDT) From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 04 Jun 2007 11:45:32 +0200 A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread Miklos Szeredi
From: Miklos Szeredi [EMAIL PROTECTED] Date: Mon, 04 Jun 2007 11:45:32 +0200 A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This happens if the send()

Re: [PATCH] fix race in AF_UNIX

2007-06-05 Thread David Miller
From: Miklos Szeredi [EMAIL PROTECTED] Date: Wed, 06 Jun 2007 07:26:52 +0200 Holding a global mutex over recvmsg() calls under AF_UNIX is pretty much a non-starter, this will kill performance for multi-threaded apps. That's an rwsem held for read. It's held for write in unix_gc() only

Re: [PATCH] fix race in AF_UNIX

2007-06-04 Thread Miklos Szeredi
A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This happens if the send() and the close() is performed between skb_dequeue() and checking sk-sk_shutdown in

Re: [PATCH] fix race in AF_UNIX

2007-06-04 Thread Miklos Szeredi
(resend, sorry, fscked up the address list) A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This happens if the send() and the close() is performed between skb_dequeue() and

Re: [PATCH] fix race in AF_UNIX

2007-06-02 Thread Arnaldo Carvalho de Melo
On 6/2/07, Miklos Szeredi [EMAIL PROTECTED] wrote: From: Miklos Szeredi [EMAIL PROTECTED] A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This happens if the send() and the