Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
David S. Miller a écrit :
> Eric, how important do you honestly think the per-hashchain spinlocks
> are? That's the big barrier to making rt_secret_rebuild() a simple
> rehash instead of flushing the whole table as it does now.

No problem for me in going to a single spinlock. I did the hashed-spinlock patch in order to reduce the size of the route hash table without hurting big NUMA machines. If you think a single spinlock is OK, that's even better!

> The lock is only grabbed for updates, and the access to these locks is
> random and as such probably non-local when taken anyway. Back before we
> used RCU for reads, this array-of-spinlocks thing made a lot more sense.
>
> I mean something like this patch:
>
> +static DEFINE_SPINLOCK(rt_hash_lock);

Just one point: this should be cache-line aligned, and should use one full cache line, to avoid false sharing at least. (If a CPU takes the lock, there is no need to invalidate *rt_hash_table for all other CPUs.)

Eric
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
From: Eric Dumazet [EMAIL PROTECTED]
Date: Sat, 07 Jan 2006 08:53:52 +0100

> I have no problem with this, since the biggest server I have is 4-way,
> but are you sure big machines won't suffer from this single spinlock?
> It is the main question. Also I don't understand what you want to do
> after this single-spinlock patch. How is it supposed to help the
> 'ip route flush cache' problem? In my case, I have about 600,000 dst
> entries:

I don't claim to have a solution to this problem currently. Doing RCU and going through the whole DST GC machinery is overkill for an active system. So, perhaps a very simple solution will do:

1) On rt_run_flush(), do not rt_free(); instead, collect all active routing cache entries onto a global list, and start a timer to fire in 10 seconds (or some sysctl-configurable amount).

2) When a new routing cache entry is needed, first check the global list appended to in #1 above; failing that, do dst_alloc() as is done currently.

3) If the timer expires, rt_free() any entries remaining on the global list.

The missing trick is how to ensure RCU semantics when reallocating from the global list.

The idea is that an active system will immediately repopulate itself with all of these entries just flushed from the table. RCU really doesn't handle this kind of problem very well. It truly excels when work is generated by process-context work, not interrupt work.
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Sat, Jan 07, 2006 at 12:36:25AM -0800, David S. Miller wrote:
> From: Eric Dumazet [EMAIL PROTECTED]
> Date: Sat, 07 Jan 2006 08:53:52 +0100
> > I have no problem with this, since the biggest server I have is
> > 4-way, but are you sure big machines won't suffer from this single
> > spinlock? It is the main question. Also I don't understand what you
> > want to do after this single-spinlock patch. How is it supposed to
> > help the 'ip route flush cache' problem? In my case, I have about
> > 600,000 dst entries:
>
> I don't claim to have a solution to this problem currently. Doing RCU
> and going through the whole DST GC machinery is overkill for an active
> system. So, perhaps a very simple solution will do:
>
> 1) On rt_run_flush(), do not rt_free(); instead, collect all active
>    routing cache entries onto a global list, and start a timer to fire
>    in 10 seconds (or some sysctl-configurable amount).
>
> 2) When a new routing cache entry is needed, first check the global
>    list appended to in #1 above; failing that, do dst_alloc() as is
>    done currently.
>
> 3) If the timer expires, rt_free() any entries remaining on the global
>    list.
>
> The missing trick is how to ensure RCU semantics when reallocating from
> the global list.

The straightforward ways of doing this require a per-entry lock in addition to the dst_entry reference count -- lots of read-side overhead. More complex approaches use a generation number that is incremented when adding to or removing from the global list. When the generation number overflows, unconditionally rt_free() the entry rather than adding it to the global list again. Then there needs to be some clever code on the read side to detect the case where the generation number changes while acquiring a reference. And memory barriers. Also lots of read-side overhead. Also, it is now -always- necessary to acquire a reference on the read side.

> The idea is that an active system will immediately repopulate itself
> with all of these entries just flushed from the table. RCU really
> doesn't handle this kind of problem very well. It truly excels when
> work is generated by process context work, not interrupt work.

Sounds like a challenge to me. ;-)

Well, one possible way to attack Eric's workload might be the following:

o  Size the hash table to strike the appropriate balance between
   read-side search overhead and memory consumption. Call the number of
   hash-chain headers N.

o  Create a hashed array of locks sized to allow the update to proceed
   sufficiently quickly. Call the number of locks M, probably a power of
   two. This means that M CPUs can be doing the update in parallel.

o  Create an array of M^2 list headers (call it xfer[][]). Since it is
   only needed during an update, it can be allocated and deallocated if
   need be. (With my big-server experience, I would probably just create
   the array statically, since M is not likely to be too large. But your
   mileage may vary. And you really only need M*(M-1) list headers, but
   that makes the index calculation a bit more annoying.)

o  Use a two-phase update. In the first phase, each updating CPU
   acquires the corresponding lock and removes entries from the
   corresponding partition of the hash table. If the new location of a
   given entry falls into the same partition, it is added back to the
   appropriate hash chain of that partition. Otherwise, add the entry to
   xfer[dst][src], where src and dst are the indexes of the
   corresponding partitions.

o  When all CPUs finish removing entries from their partition, they
   check into a barrier. Once all have checked in, they can start the
   second phase of the update.

o  In the second phase, each CPU removes the entries from the xfer array
   that are destined for its partition and adds them to the hash chain
   that they are destined for.

Some commentary and variations, in the hope that this inspires someone to come up with an even better idea:

o  Unless M is at least three, there is no performance gain over a
   single global lock with a single CPU doing the update, since each
   element must now undergo four list operations rather than just two.

o  The xfer[][] array must have each entry cache-aligned, or you lose
   big on cacheline effects. Note that it is -not- sufficient to simply
   align the rows or the columns, since each CPU has its own column when
   inserting into and its own row when removing from xfer[][].

o  The data-skew effects are less severe if this procedure runs from
   process context; a spinning barrier must be used otherwise. But note
   that the per-partition locks could remain spinlocks; only the barrier
   need involve sleeping. (In case that helps; I am getting a bit ahead
   of my understanding of this part of the kernel.)
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency (Version 2), HOTPLUG_CPU fix
First patch was buggy, sorry :(

This 2nd version makes no more RCU assumptions, because only the 'donelist' queue is fetched for an item to be deleted. Items on the donelist are ready to be freed. This V2 also corrects a problem in the CPU-hotplug case: we forgot to update the ->count variable when transferring a queue to another one.

In order to avoid some OOMs triggered by a flood of call_rcu() calls, we increased maxbatch from 10 to 10000 in Linux 2.6.14, and conditionally call set_need_resched() in call_rcu(). This solution doesn't solve all the problems and has drawbacks:

1) Using a big maxbatch has a bad impact on latency.
2) A flood of call_rcu_bh() can still OOM.

I have some servers that crash once in a while when the IP route cache is flushed. After raising /proc/sys/net/ipv4/route/secret_interval (so that *no* flush is done), I got better uptime on these servers. But in some cases I think the network stack can flood call_rcu_bh(), and a fatal OOM occurs.

I suggest in this patch:

1) Lowering maxbatch to a more reasonable value (as far as latency is concerned).
2) Being able to guard an RCU CPU queue against a maximal count (10,000 for example). If this limit is reached, free the oldest entry (if available from the donelist queue).
3) A bug correction in __rcu_offline_cpu(), where we forgot to adjust the ->count field when transferring a queue to another one.

In my stress tests, I could not reproduce the OOM anymore after applying this patch.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED]

--- linux-2.6.15/kernel/rcupdate.c	2006-01-03 04:21:10.000000000 +0100
+++ linux-2.6.15-edum/kernel/rcupdate.c	2006-01-06 13:32:02.000000000 +0100
@@ -71,14 +71,14 @@
 
 /* Fake initialization required by compiler */
 static DEFINE_PER_CPU(struct tasklet_struct, rcu_tasklet) = {NULL};
-static int maxbatch = 10000;
+static int maxbatch = 100;
 
 #ifndef __HAVE_ARCH_CMPXCHG
 /*
  * We use an array of spinlocks for the rcurefs -- similar to ones in sparc
  * 32 bit atomic_t implementations, and a hash function similar to that
  * for our refcounting needs.
- * Can't help multiprocessors which donot have cmpxchg :(
+ * Can't help multiprocessors which dont have cmpxchg :(
  */
 
 spinlock_t __rcuref_hash[RCUREF_HASH_SIZE] = {
@@ -110,9 +110,19 @@
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
-	if (unlikely(++rdp->count > 10000))
-		set_need_resched();
-
+/*
+ * OOM avoidance : If we queued too many items in this queue,
+ * free the oldest entry (from the donelist only to respect
+ * RCU constraints)
+ */
+	if (unlikely(++rdp->count > 10000 && (head = rdp->donelist))) {
+		rdp->count--;
+		rdp->donelist = head->next;
+		if (!rdp->donelist)
+			rdp->donetail = &rdp->donelist;
+		local_irq_restore(flags);
+		return head->func(head);
+	}
 	local_irq_restore(flags);
 }
 
@@ -148,12 +158,19 @@
 	rdp = &__get_cpu_var(rcu_bh_data);
 	*rdp->nxttail = head;
 	rdp->nxttail = &head->next;
-	rdp->count++;
 /*
- * Should we directly call rcu_do_batch() here ?
- * if (unlikely(rdp->count > 10000))
- *	rcu_do_batch(rdp);
+ * OOM avoidance : If we queued too many items in this queue,
+ * free the oldest entry (from the donelist only to respect
+ * RCU constraints)
  */
+	if (unlikely(++rdp->count > 10000 && (head = rdp->donelist))) {
+		rdp->count--;
+		rdp->donelist = head->next;
+		if (!rdp->donelist)
+			rdp->donetail = &rdp->donelist;
+		local_irq_restore(flags);
+		return head->func(head);
+	}
 	local_irq_restore(flags);
 }
 
@@ -208,19 +225,20 @@
  */
 static void rcu_do_batch(struct rcu_data *rdp)
 {
-	struct rcu_head *next, *list;
-	int count = 0;
+	struct rcu_head *next = NULL, *list;
+	int count = maxbatch;
 
 	list = rdp->donelist;
 	while (list) {
-		next = rdp->donelist = list->next;
+		next = list->next;
 		list->func(list);
 		list = next;
 		rdp->count--;
-		if (++count >= maxbatch)
+		if (--count <= 0)
 			break;
 	}
-	if (!rdp->donelist)
+	rdp->donelist = next;
+	if (!next)
 		rdp->donetail = &rdp->donelist;
 	else
 		tasklet_schedule(&per_cpu(rcu_tasklet, rdp->cpu));
@@ -344,11 +362,9 @@
 static void rcu_move_batch(struct rcu_data *this_rdp, struct rcu_head *list,
 				struct rcu_head **tail)
 {
-	local_irq_disable();
 	*this_rdp->nxttail = list;
 	if (list)
 		this_rdp->nxttail = tail;
-	local_irq_enable();
 }
 
 static void __rcu_offline_cpu(struct rcu_data *this_rdp,
@@ -362,9 +378,12 @@
 	if (rcp->cur != rcp->completed)
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Friday 06 January 2006 11:17, Eric Dumazet wrote:
> I assume that if a CPU queued 10,000 items in its RCU queue, then the
> oldest entry cannot still be in use by another CPU. This might sound
> like a violation of RCU rules (I'm not an RCU expert), but it seems
> quite reasonable.

I don't think it's a good assumption. Another CPU might be stuck in a long-running interrupt, and still have a reference in the code running below the interrupt handler. And in general, letting correctness depend on magic numbers like this is very nasty.

-Andi
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
Andi Kleen a écrit :
> On Friday 06 January 2006 11:17, Eric Dumazet wrote:
> > I assume that if a CPU queued 10,000 items in its RCU queue, then the
> > oldest entry cannot still be in use by another CPU. This might sound
> > like a violation of RCU rules (I'm not an RCU expert), but it seems
> > quite reasonable.
>
> I don't think it's a good assumption. Another CPU might be stuck in a
> long-running interrupt, and still have a reference in the code running
> below the interrupt handler. And in general, letting correctness depend
> on magic numbers like this is very nasty.

I agree, Andi. I posted a 2nd version of the patch with no more assumptions.

Eric
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
Alan Cox a écrit :
> On Gwe, 2006-01-06 at 11:17 +0100, Eric Dumazet wrote:
> > I assume that if a CPU queued 10,000 items in its RCU queue, then the
> > oldest entry cannot still be in use by another CPU. This might sound
> > like a violation of RCU rules (I'm not an RCU expert), but it seems
> > quite reasonable.
>
> Fixing the real problem in the routing code would be the real fix.

So far nobody has succeeded in 'fixing the routing code'; few people can even read the code from the first line to the last one... I think this code is not buggy: it only makes general RCU assumptions about delayed freeing of dst entries. In some cases, the general assumptions are just wrong. We can fix it at the RCU level, and future users of call_rcu_bh() won't have to think *hard* about 'general assumptions'. Of course, we can ignore the RCU problem and mark somewhere on a sticker:

***DONT USE OR RISK CRASHES***
***USE IT ONLY FOR FUN***

> The underlying problem of RCU and memory usage could be solved more
> safely by making sure that the sleeping memory allocator path always
> waits until at least one RCU cleanup has occurred after it fails an
> allocation before it starts trying harder. That ought to also naturally
> throttle memory consumers more in that situation, which is the right
> behaviour.

In the case of call_rcu_bh(), you can be sure that the caller cannot afford 'sleeping memory allocations'. Better to drop a frame than block the stack, no?

Eric
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Gwe, 2006-01-06 at 15:00 +0100, Eric Dumazet wrote:
> In the case of call_rcu_bh(), you can be sure that the caller cannot
> afford 'sleeping memory allocations'. Better to drop a frame than block
> the stack, no?

Atomic allocations can't sleep and will fail, which is fine. If memory allocation pressure exists for sleeping allocations because of a large RCU backlog, we want to be sure that the RCU backlog from the networking stack or other sources does not cause us to OOM-kill or take incorrect action. So if, for example, we want to grow a process stack and the memory is there, just stuck in the RCU lists pending recovery, we want to let the RCU recovery happen before making drastic decisions.
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Fri, Jan 06, 2006 at 01:37:12PM +0000, Alan Cox wrote:
> On Gwe, 2006-01-06 at 11:17 +0100, Eric Dumazet wrote:
> > I assume that if a CPU queued 10,000 items in its RCU queue, then the
> > oldest entry cannot still be in use by another CPU. This might sound
> > like a violation of RCU rules (I'm not an RCU expert), but it seems
> > quite reasonable.
>
> Fixing the real problem in the routing code would be the real fix. The
> underlying problem of RCU and memory usage could be solved more safely
> by making sure that the sleeping memory allocator path always waits
> until at least one RCU cleanup has occurred after it fails an
> allocation before it starts trying harder. That ought to also
> naturally throttle memory consumers more in that situation, which is
> the right behaviour.

A quick look at rt_garbage_collect() leads me to believe that, although the IP route cache does try to limit its use of memory, it does not fully account for memory that it has released to RCU but that RCU has not yet freed, because a grace period has not yet elapsed. The following appears to be possible:

1. rt_garbage_collect() sees that there are too many entries, and sets "goal" to the number to free up, based on a computed equilibrium value.

2. The number of entries is (correctly) decremented only when the corresponding RCU callback is invoked, which actually frees the entry.

3. Between the time that rt_garbage_collect() is invoked the first time and when the RCU grace period ends, rt_garbage_collect() is invoked again. It still sees too many entries (since RCU has not yet freed the ones released by the earlier invocation in step (1) above), so it frees a bunch more.

4. Packets routed now miss the route cache, because the corresponding entries are waiting for a grace period, slowing the system down. Therefore, even more entries are freed to make room for new entries corresponding to the new packets.

If my (likely quite naive) reading of the IP route cache code is correct, it would be possible to end up in a steady state with most of the entries always being in RCU rather than in the route cache. Eric, could this be what is happening to your system? If it is, one straightforward fix would be to keep a count of the number of route-cache entries waiting on RCU, and have rt_garbage_collect() subtract this number of entries from its goal. Does this make sense?

Thanx, Paul
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
Paul E. McKenney a écrit :
> On Fri, Jan 06, 2006 at 01:37:12PM +0000, Alan Cox wrote:
> > Fixing the real problem in the routing code would be the real fix.
> > [...]
>
> A quick look at rt_garbage_collect() leads me to believe that, although
> the IP route cache does try to limit its use of memory, it does not
> fully account for memory that it has released to RCU but that RCU has
> not yet freed, because a grace period has not yet elapsed. [...]
>
> If my (likely quite naive) reading of the IP route cache code is
> correct, it would be possible to end up in a steady state with most of
> the entries always being in RCU rather than in the route cache. Eric,
> could this be what is happening to your system? If it is, one
> straightforward fix would be to keep a count of the number of
> route-cache entries waiting on RCU, and have rt_garbage_collect()
> subtract this number of entries from its goal. Does this make sense?

Hi Paul

Thanks for reviewing the route code :)

As I said, the problem comes from 'route flush cache', which is done periodically by rt_run_flush(), triggered by rt_flush_timer. The 10% of LOWMEM RAM that was used by route-cache entries is pushed into the RCU queues (with call_rcu_bh()), and the network continues to receive packets from *many* sources that want their route-cache entry.

Eric
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Fri, 2006-01-06 at 13:58 +0100, Andi Kleen wrote:
> Another CPU might be stuck in a long running interrupt

Shouldn't a long-running interrupt be considered a bug?

Lee
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Fri, 2006-01-06 at 11:17 +0100, Eric Dumazet wrote:
> I have some servers that crash once in a while when the IP route cache
> is flushed. After raising /proc/sys/net/ipv4/route/secret_interval (so
> that *no* flush is done), I got better uptime on these servers.

Argh, where is that documented? I have been banging my head against this for weeks: how do I keep the kernel from flushing 4096 routes at once in softirq context, causing huge (~8-20 ms) latency problems? I tried all the route-related sysctls I could find and nothing worked...

Lee
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Fri, Jan 06, 2006 at 06:19:15PM +0100, Eric Dumazet wrote:
> Paul E. McKenney a écrit :
> > A quick look at rt_garbage_collect() leads me to believe that,
> > although the IP route cache does try to limit its use of memory, it
> > does not fully account for memory that it has released to RCU but
> > that RCU has not yet freed, because a grace period has not yet
> > elapsed. [...] If it is, one straightforward fix would be to keep a
> > count of the number of route-cache entries waiting on RCU, and have
> > rt_garbage_collect() subtract this number of entries from its goal.
> > Does this make sense?
>
> Hi Paul
>
> Thanks for reviewing the route code :)
>
> As I said, the problem comes from 'route flush cache', which is done
> periodically by rt_run_flush(), triggered by rt_flush_timer. The 10% of
> LOWMEM RAM that was used by route-cache entries is pushed into the RCU
> queues (with call_rcu_bh()), and the network continues to receive
> packets from *many* sources that want their route-cache entry.

Hello, Eric,

The rt_run_flush() function could indeed be suffering from the same problem. Dipankar's recent patch should help RCU grace periods proceed more quickly; does that help? If not, it may be worthwhile to limit the number of times that rt_run_flush() runs per RCU grace period.

Thanx, Paul
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
On Saturday 07 January 2006 01:17, David S. Miller wrote:
> I mean something like this patch:

Looks like a good idea to me. I always disliked the per-chain spinlocks, even for other hash tables like the TCP/UDP multiplex; it would be much nicer to use a much smaller, separately hashed lock table and save cache. In this case, the special case of using a one-entry-only lock hash table makes sense.

-Andi
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
From: Andi Kleen [EMAIL PROTECTED]
Date: Sat, 7 Jan 2006 02:09:01 +0100

> I always disliked the per-chain spinlocks, even for other hash tables
> like the TCP/UDP multiplex; it would be much nicer to use a much
> smaller, separately hashed lock table and save cache. In this case,
> the special case of using a one-entry-only lock hash table makes sense.

I used to think they were a great technique. But in each case where I thought they could be applied, better schemes have come along. In the case of the page cache we went to a per-address-space tree, and here in the routing cache we went to RCU. There are RCU patches around for the TCP hashes, and I'd like to put those in at some point as well. In fact, they'd be even more far-reaching, since Arnaldo abstracted the socket hashing stuff away into an inet_hashtables subsystem.
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
Andi Kleen a écrit :
> I always disliked the per-chain spinlocks, even for other hash tables
> like the TCP/UDP multiplex; it would be much nicer to use a much
> smaller, separately hashed lock table and save cache. In this case,
> the special case of using a one-entry-only lock hash table makes sense.

I agree. I do use a hashed spinlock array in my local tree for TCP, mainly to reduce the hash table size by a factor of 2.

Eric
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
From: Eric Dumazet [EMAIL PROTECTED]
Date: Sat, 07 Jan 2006 08:34:35 +0100

> I agree. I do use a hashed spinlock array in my local tree for TCP,
> mainly to reduce the hash table size by a factor of 2.

So what do you think about going to a single spinlock for the routing cache?
Re: [PATCH, RFC] RCU : OOM avoidance and lower latency
David S. Miller a écrit :
> From: Eric Dumazet [EMAIL PROTECTED]
> Date: Sat, 07 Jan 2006 08:34:35 +0100
> > I agree. I do use a hashed spinlock array in my local tree for TCP,
> > mainly to reduce the hash table size by a factor of 2.
>
> So what do you think about going to a single spinlock for the routing
> cache?

I have no problem with this, since the biggest server I have is 4-way, but are you sure big machines won't suffer from this single spinlock? It is the main question.

Also I don't understand what you want to do after this single-spinlock patch. How is it supposed to help the 'ip route flush cache' problem? In my case, I have about 600,000 dst entries:

# grep ip_dst /proc/slabinfo
ip_dst_cache      616250 622440    320   12    1 : tunables   54   27    8 : slabdata  51870  51870      0

Eric