Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread Dmitry Torokhov
On Thursday 10 February 2005 22:50, David S. Miller wrote: > > > Unlike the above routines, it is required that explicit memory > > > barriers are performed before and after the operation.  It must > > > be done such that all memory operations before and after the > > > atomic operation calls are

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread Werner Almesberger
David S. Miller wrote: > Absolutely, I agree. My fingers even itched as I typed those lines > in. I didn't change the wording because I couldn't come up with > anything better. How about something like: Unlike the above routines, atomic_???_return are required to perform memory barriers

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread David S. Miller
On Thu, 10 Feb 2005 01:23:04 -0300 Werner Almesberger <[EMAIL PROTECTED]> wrote: > David S. Miller wrote: > > Unlike the above routines, it is required that explicit memory > > barriers are performed before and after the operation. It must > > be done such that all memory operations before and

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread David S. Miller
On Thu, 10 Feb 2005 15:56:47 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > > ? If yes, is this a good idea ? > > Dave mentioned that on sparc64, atomic_inc_and_test is much more > expensive than the second variant. Actually, besides the memory barriers themselves, all variants are equally

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread David S. Miller
On Thu, 10 Feb 2005 15:56:47 +1100 Herbert Xu [EMAIL PROTECTED] wrote: ? If yes, is this a good idea ? Dave mentioned that on sparc64, atomic_inc_and_test is much more expensive than the second variant. Actually, besides the memory barriers themselves, all variants are equally expensive.

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread David S. Miller
On Thu, 10 Feb 2005 01:23:04 -0300 Werner Almesberger [EMAIL PROTECTED] wrote: David S. Miller wrote: Unlike the above routines, it is required that explicit memory barriers are performed before and after the operation. It must be done such that all memory operations before and after the

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread Werner Almesberger
David S. Miller wrote: Absolutely, I agree. My fingers even itched as I typed those lines in. I didn't change the wording because I couldn't come up with anything better. How about something like: Unlike the above routines, atomic_???_return are required to perform memory barriers [...]

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-10 Thread Dmitry Torokhov
On Thursday 10 February 2005 22:50, David S. Miller wrote: Unlike the above routines, it is required that explicit memory barriers are performed before and after the operation.  It must be done such that all memory operations before and after the atomic operation calls are strongly

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-09 Thread Herbert Xu
On Thu, Feb 10, 2005 at 01:23:04AM -0300, Werner Almesberger wrote: > > What happens if the operation could return a value, but the user > ignores it ? E.g. if I don't like smp_mb__*, could I just use > > atomic_inc_and_test(foo); > > instead of > > smp_mb__before_atomic_inc(); >

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-09 Thread Werner Almesberger
David S. Miller wrote: > This document is intended to serve as a guide to Linux port > maintainers on how to implement atomic counter and bitops operations > properly. Finally, some light is shed into one of the most arcane areas of the kernel ;-) Thanks ! > Unlike the above routines, it

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-09 Thread Werner Almesberger
David S. Miller wrote: This document is intended to serve as a guide to Linux port maintainers on how to implement atomic counter and bitops operations properly. Finally, some light is shed into one of the most arcane areas of the kernel ;-) Thanks ! Unlike the above routines, it is

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-09 Thread Herbert Xu
On Thu, Feb 10, 2005 at 01:23:04AM -0300, Werner Almesberger wrote: What happens if the operation could return a value, but the user ignores it ? E.g. if I don't like smp_mb__*, could I just use atomic_inc_and_test(foo); instead of smp_mb__before_atomic_inc();

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-05 Thread David S. Miller
On Fri, 4 Feb 2005 12:55:39 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > OK, here is the patch to do that. Let's get rid of kfree_skb_fast > while we're at it since it's no longer used. > > Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> I've queued this up for 2.6.x and 2.4.x, thanks everyone.

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-05 Thread David S. Miller
On Fri, 4 Feb 2005 12:55:39 +1100 Herbert Xu [EMAIL PROTECTED] wrote: OK, here is the patch to do that. Let's get rid of kfree_skb_fast while we're at it since it's no longer used. Signed-off-by: Herbert Xu [EMAIL PROTECTED] I've queued this up for 2.6.x and 2.4.x, thanks everyone. - To

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread Herbert Xu
On Fri, Feb 04, 2005 at 10:24:28PM -0800, David S. Miller wrote: > > Ok, as promised, here is the updated doc. Who should Looks good David. -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key:

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread David S. Miller
On Fri, 4 Feb 2005 15:48:55 -0800 "David S. Miller" <[EMAIL PROTECTED]> wrote: > Something like that. I'll update the atomic_ops.txt > doc and post and updated version later tonight. Ok, as promised, here is the updated doc. Who should I author this as? Perhaps "Anton's evil twin" :-) ---

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread David S. Miller
On Fri, 4 Feb 2005 22:33:05 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > I think you should probably note that some sort of locking or RCU > scheme is required to make this safe. As it is the atomic_inc > and the list_add can be reordered such that the atomic_inc occurs > after the

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread Herbert Xu
On Thu, Feb 03, 2005 at 03:08:21PM -0800, David S. Miller wrote: > > Ok, here goes nothing. Can someone run with this? It should > be rather complete, and require only minor editorial work. Thanks. It's a very nice piece of work. > A missing memory barrier in the cases where they are

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread Olaf Kirch
On Fri, Feb 04, 2005 at 12:55:39PM +1100, Herbert Xu wrote: > OK, here is the patch to do that. Let's get rid of kfree_skb_fast > while we're at it since it's no longer used. Thanks, I'll give that to the PPC folks and ask the to run with it. Regards, Olaf -- Olaf Kirch | --- o --- Nous

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread Olaf Kirch
On Fri, Feb 04, 2005 at 12:55:39PM +1100, Herbert Xu wrote: OK, here is the patch to do that. Let's get rid of kfree_skb_fast while we're at it since it's no longer used. Thanks, I'll give that to the PPC folks and ask the to run with it. Regards, Olaf -- Olaf Kirch | --- o --- Nous

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread Herbert Xu
On Thu, Feb 03, 2005 at 03:08:21PM -0800, David S. Miller wrote: Ok, here goes nothing. Can someone run with this? It should be rather complete, and require only minor editorial work. Thanks. It's a very nice piece of work. A missing memory barrier in the cases where they are required

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread David S. Miller
On Fri, 4 Feb 2005 22:33:05 +1100 Herbert Xu [EMAIL PROTECTED] wrote: I think you should probably note that some sort of locking or RCU scheme is required to make this safe. As it is the atomic_inc and the list_add can be reordered such that the atomic_inc occurs after the

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread David S. Miller
On Fri, 4 Feb 2005 15:48:55 -0800 David S. Miller [EMAIL PROTECTED] wrote: Something like that. I'll update the atomic_ops.txt doc and post and updated version later tonight. Ok, as promised, here is the updated doc. Who should I author this as? Perhaps Anton's evil twin :-) ---

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-04 Thread Herbert Xu
On Fri, Feb 04, 2005 at 10:24:28PM -0800, David S. Miller wrote: Ok, as promised, here is the updated doc. Who should Looks good David. -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key:

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 10:50:44 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > So the problem isn't as big as I thought which is good. sk_buff > is only in trouble because of the atomic_read optimisation which > really needs a memory barrier. > > However, instead of adding a memory barrier which

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Thu, Feb 03, 2005 at 05:23:57PM -0800, David S. Miller wrote: > > You're absolutely right. Ok, so we do need to change kfree_skb(). > I believe even with the memory barrier, the atomic_read() optimization > is still worth it. atomic ops on sparc64 take a minimum of 40 some odd > cycles on

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 12:20:53 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > This is true if CPU 0 reads the count before reading skb->list. > Without a memory barrier, atomic_read and reading skb->list can > be reordered. Put it another way, reading skb->list could return > a cached value that was

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Thu, Feb 03, 2005 at 04:49:22PM -0800, David S. Miller wrote: > > If we see the count dropped to "1", whoever set it to "1" made > sure that all outstanding memory operations (including things > like __skb_unlink()) are globally visible before the > atomic_dec_and_test() which put the thing to

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Thu, 3 Feb 2005 22:12:24 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > This paradigm is repeated throughout the kernel. I bet the > same race can be found in a lot of those places. So we really > need to sit down and audit them one by one or else come up with > a magical solution apart from

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Thu, Feb 03, 2005 at 02:19:01PM -0800, David S. Miller wrote: > > They are for cases where you want strict ordering even for the > non-return-value-giving atomic_t ops. I see. I got atomic_dec and atomic_dec_and_test mixed up. So the problem isn't as big as I thought which is good. sk_buff

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 01:27:05 +1100 Anton Blanchard <[EMAIL PROTECTED]> wrote: > Its difficult stuff, everyone gets it wrong and Andrew keeps > hassling me to write up a document explaining it. Ok, here goes nothing. Can someone run with this? It should be rather complete, and require only minor

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 07:30:10 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > On Fri, Feb 04, 2005 at 01:27:05AM +1100, Anton Blanchard wrote: > > > > Architectures should guarantee that any of the atomics and bitops that > > return values order in both directions. So you dont need the > >

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Fri, Feb 04, 2005 at 01:27:05AM +1100, Anton Blanchard wrote: > > Architectures should guarantee that any of the atomics and bitops that > return values order in both directions. So you dont need the > smp_mb__before_atomic_dec here. I wasn't aware of this requirement before. However, if

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 01:27:05 +1100 Anton Blanchard <[EMAIL PROTECTED]> wrote: > Architectures should guarantee that any of the atomics and bitops that > return values order in both directions. So you dont need the > smp_mb__before_atomic_dec here. > > It is, however, required on the atomics and

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Anton Blanchard
Hi, > For example, in this particular case, a more sinister (but probably > impossible for sk_buff objects) problem would be for the list removal > itself to be delayed until after the the kfree_skb. This could > potentially mean that we're reading/writing memory that's already > been freed. >

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Wed, Feb 02, 2005 at 04:20:23PM -0800, David S. Miller wrote: > > > if (atomic_read(>users) != 1) { > > smp_mb__before_atomic_dec(); > > if (!atomic_dec_and_test(>users)) > > return; > > } > > __kfree_skb(skb); > > This looks good. Olaf

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Wed, Feb 02, 2005 at 04:20:23PM -0800, David S. Miller wrote: if (atomic_read(skb-users) != 1) { smp_mb__before_atomic_dec(); if (!atomic_dec_and_test(skb-users)) return; } __kfree_skb(skb); This looks good. Olaf can you

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Anton Blanchard
Hi, For example, in this particular case, a more sinister (but probably impossible for sk_buff objects) problem would be for the list removal itself to be delayed until after the the kfree_skb. This could potentially mean that we're reading/writing memory that's already been freed.

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 01:27:05 +1100 Anton Blanchard [EMAIL PROTECTED] wrote: Architectures should guarantee that any of the atomics and bitops that return values order in both directions. So you dont need the smp_mb__before_atomic_dec here. It is, however, required on the atomics and bitops

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Fri, Feb 04, 2005 at 01:27:05AM +1100, Anton Blanchard wrote: Architectures should guarantee that any of the atomics and bitops that return values order in both directions. So you dont need the smp_mb__before_atomic_dec here. I wasn't aware of this requirement before. However, if this is

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 07:30:10 +1100 Herbert Xu [EMAIL PROTECTED] wrote: On Fri, Feb 04, 2005 at 01:27:05AM +1100, Anton Blanchard wrote: Architectures should guarantee that any of the atomics and bitops that return values order in both directions. So you dont need the

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 01:27:05 +1100 Anton Blanchard [EMAIL PROTECTED] wrote: Its difficult stuff, everyone gets it wrong and Andrew keeps hassling me to write up a document explaining it. Ok, here goes nothing. Can someone run with this? It should be rather complete, and require only minor

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Thu, Feb 03, 2005 at 02:19:01PM -0800, David S. Miller wrote: They are for cases where you want strict ordering even for the non-return-value-giving atomic_t ops. I see. I got atomic_dec and atomic_dec_and_test mixed up. So the problem isn't as big as I thought which is good. sk_buff is

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Thu, 3 Feb 2005 22:12:24 +1100 Herbert Xu [EMAIL PROTECTED] wrote: This paradigm is repeated throughout the kernel. I bet the same race can be found in a lot of those places. So we really need to sit down and audit them one by one or else come up with a magical solution apart from

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Thu, Feb 03, 2005 at 04:49:22PM -0800, David S. Miller wrote: If we see the count dropped to 1, whoever set it to 1 made sure that all outstanding memory operations (including things like __skb_unlink()) are globally visible before the atomic_dec_and_test() which put the thing to 1 from

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 12:20:53 +1100 Herbert Xu [EMAIL PROTECTED] wrote: This is true if CPU 0 reads the count before reading skb-list. Without a memory barrier, atomic_read and reading skb-list can be reordered. Put it another way, reading skb-list could return a cached value that was read

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread Herbert Xu
On Thu, Feb 03, 2005 at 05:23:57PM -0800, David S. Miller wrote: You're absolutely right. Ok, so we do need to change kfree_skb(). I believe even with the memory barrier, the atomic_read() optimization is still worth it. atomic ops on sparc64 take a minimum of 40 some odd cycles on

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-03 Thread David S. Miller
On Fri, 4 Feb 2005 10:50:44 +1100 Herbert Xu [EMAIL PROTECTED] wrote: So the problem isn't as big as I thought which is good. sk_buff is only in trouble because of the atomic_read optimisation which really needs a memory barrier. However, instead of adding a memory barrier which makes the

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-02 Thread David S. Miller
On Mon, 31 Jan 2005 22:33:26 +1100 Herbert Xu <[EMAIL PROTECTED]> wrote: > We're using atomic integers to signal that we're done with an object. > The object is usually represented by a piece of memory. > > The problem is that in most of the places where we do this (and that's > not just in the

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-02-02 Thread David S. Miller
On Mon, 31 Jan 2005 22:33:26 +1100 Herbert Xu [EMAIL PROTECTED] wrote: We're using atomic integers to signal that we're done with an object. The object is usually represented by a piece of memory. The problem is that in most of the places where we do this (and that's not just in the

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-01-31 Thread Herbert Xu
Olaf Kirch <[EMAIL PROTECTED]> wrote: > > The problem is that IBM testing was hitting the assertion in kfree_skb > that checks that the skb has been removed from any list it was on > ("kfree_skb passed an skb still on a list"). That must've be some testing to catch this :) > One possible fix

Re: [PATCH] arp_queue: serializing unlink + kfree_skb

2005-01-31 Thread Herbert Xu
Olaf Kirch [EMAIL PROTECTED] wrote: The problem is that IBM testing was hitting the assertion in kfree_skb that checks that the skb has been removed from any list it was on (kfree_skb passed an skb still on a list). That must've be some testing to catch this :) One possible fix here would