* Christoph Lameter ([EMAIL PROTECTED]) wrote: > On Tue, 21 Aug 2007, Mathieu Desnoyers wrote: > > > - Fixed an erroneous test in slab_free() (logic was flipped from the > > original code when testing for slow path. It explains the wrong > > numbers you have with big free). > > If you look at the numbers that I posted earlier then you will see that > even the measurements without free were not up to par. >
I seem to get a clear performance improvement in the kmalloc fast path. > > It applies on top of the > > "SLUB Use cmpxchg() everywhere" patch. > > Which one is that? > This one: SLUB Use cmpxchg() everywhere. It applies to "SLUB: Single atomic instruction alloc/free using cmpxchg". Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]> --- mm/slub.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: slab/mm/slub.c =================================================================== --- slab.orig/mm/slub.c 2007-08-20 18:42:16.000000000 -0400 +++ slab/mm/slub.c 2007-08-20 18:42:28.000000000 -0400 @@ -1682,7 +1682,7 @@ redo: object[c->offset] = freelist; - if (unlikely(cmpxchg_local(&c->freelist, freelist, object) != freelist)) + if (unlikely(cmpxchg(&c->freelist, freelist, object) != freelist)) goto redo; return; slow: > > | slab.git HEAD slub (min-max) | cmpxchg_local slub > > kmalloc(8) | 190 - 201 | 83 > > kfree(8) | 351 - 351 | 363 > > kmalloc(64) | 224 - 245 | 115 > > kfree(64) | 389 - 394 | 397 > > kmalloc(16384)| 713 - 741 | 724 > > kfree(16384) | 843 - 856 | 843 > > > > Therefore, there seems to be a repeatable gain on the kmalloc fast path > > (more than twice faster). No significant performance hit for the kfree > > case, but no gain neither, same for large kmalloc, as expected. > > There is a consistent loss on slab_free it seems. The 16k numbers are > irrelevant since we do not use slab_alloc/slab_free due to the direct pass > through patch but call the page allocator directly. That also explains > that there is no loss there. > Yes. slab_free in these tests falls mostly into __slab_free() slow path (I instrumented the number of slow and fast path to get this). The small performance hit (~10 cycles) can be explained by the added preempt_disable()/preempt_enable(). > The kmalloc numbers look encouraging. I will check to see if I can > reproduce it once I sort out the patches. Ok. -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/