On Wed, Nov 22, 2006 at 04:02:29PM +0100, Eric Dumazet wrote: > On some workloads, (for example when lot of close() syscalls are done), RCU > qlen can be quite large, and RCU heads are no longer in cpu cache when > rcu_do_batch() is called. > > This patches adds a prefetch() in rcu_do_batch() to give CPU a hint to bring > back cache lines containing 'struct rcu_head's. > > Most list manipulations macros include prefetch(), but not open coded ones (at > least with current C compilers :) ) > > I got a nice speedup on a trivial benchmark (3.48 us per iteration instead of > 3.95 us on a 1.6 GHz Pentium-M) > while (1) { pipe(p); close(fd[0]); close(fd[1]);}
Interesting! How much of the speedup was due to the prefetch() and how much to removing the extra store to rdp->donelist? Thanx, Paul > Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]> > --- linux-2.6.19-rc6/kernel/rcupdate.c 2006-11-16 05:03:40.000000000 > +0100 > +++ linux-2.6.19-rc6-ed/kernel/rcupdate.c 2006-11-22 15:12:09.000000000 > +0100 > @@ -235,12 +235,14 @@ static void rcu_do_batch(struct rcu_data > > list = rdp->donelist; > while (list) { > - next = rdp->donelist = list->next; > + next = list->next; > + prefetch(next); > list->func(list); > list = next; > if (++count >= rdp->blimit) > break; > } > + rdp->donelist = list; > > local_irq_disable(); > rdp->qlen -= count; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/