Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-09 Thread Christoph Lameter
On Fri, 9 Mar 2007, Mel Gorman wrote:
> The results without slub_debug were not good except for IA64. x86_64 and ppc64
> both blew up for a variety of reasons. The IA64 results were
Yuck that is the dst issue that Adrian is also looking at. Likely an issue with slab merging and RCU frees.

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-09 Thread Christoph Lameter
On Fri, 9 Mar 2007, Mel Gorman wrote:
> I'm not sure what you mean by per-order queues. The buddy allocator already
> has per-order lists.
Somehow they do not seem to work right. SLAB (and now SLUB too) can avoid (or defer) fragmentation by keeping its own queues.

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-09 Thread Mel Gorman
> Note that I am amazed that the kernbench even worked.
The results without slub_debug were not good except for IA64. x86_64 and ppc64 both blew up for a variety of reasons. The IA64 results were
KernBench Comparison 2.6.21-rc2-mm2-clean

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-09 Thread Mel Gorman
On Thu, 8 Mar 2007, Christoph Lameter wrote:
> Note that I am amazed that the kernbench even worked. On small machine
How small? The machines I am testing on aren't "big" but they aren't miserable either.
> I seem to be getting into trouble with order 1 allocations.
That in itself is pretty

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-09 Thread Mel Gorman
On Thu, 8 Mar 2007, Christoph Lameter wrote:
> On Thu, 8 Mar 2007, Mel Gorman wrote:
> > > Note that the 16kb page size has a major impact on SLUB performance. On IA64 slub will use only 1/4th the locking overhead as on 4kb platforms.
> > It'll be interesting to see the kernbench tests then with

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-08 Thread Christoph Lameter
Note that I am amazed that the kernbench even worked. On small machine I seem to be getting into trouble with order 1 allocations. SLAB seems to be able to avoid the situation by keeping higher order pages on a freelist and reduce the alloc/frees of higher order pages that the page allocator

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-08 Thread Christoph Lameter
On Thu, 8 Mar 2007, Mel Gorman wrote:
> > Note that the 16kb page size has a major
> > impact on SLUB performance. On IA64 slub will use only 1/4th the locking
> > overhead as on 4kb platforms.
> It'll be interesting to see the kernbench tests then with debugging
> disabled.
You can get a similar

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-08 Thread Christoph Lameter
On Thu, 8 Mar 2007, Mel Gorman wrote:
> Brought up 4 CPUs
> Node 0 CPUs: 0-3
> mm/memory.c:111: bad pud c50e4480.
Lower bits must be clear right? Looks like the pud was released and then reused for a 64 byte cache or so. This is likely a freelist pointer that slub put there after

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-08 Thread Mel Gorman
On (08/03/07 08:48), Christoph Lameter didst pronounce:
> On Thu, 8 Mar 2007, Mel Gorman wrote:
> > On x86_64, it completed successfully and looked reliable. There was a 5%
> > performance loss on kernbench and aim9 figures were way down. However, with
> > slub_debug enabled, I would expect that so

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-08 Thread Christoph Lameter
On Thu, 8 Mar 2007, Mel Gorman wrote:
> On x86_64, it completed successfully and looked reliable. There was a 5%
> performance loss on kernbench and aim9 figures were way down. However, with
> slub_debug enabled, I would expect that so it's not a fair comparison
> performance wise. I'll rerun the

Re: [SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-08 Thread Mel Gorman
On Tue, 6 Mar 2007, Christoph Lameter wrote:
> [PATCH] SLUB The unqueued slab allocator v4
Hi Christoph, I shoved these patches through a few tests on x86, x86_64, ia64 and ppc64 last night to see how they got on. I enabled slub_debug to catch any surprises that may be creeping about. The

[SLUB 0/3] SLUB: The unqueued slab allocator V4

2007-03-06 Thread Christoph Lameter
[PATCH] SLUB The unqueued slab allocator v4
V3->V4
- Rename /proc/slabinfo to /proc/slubinfo. We have a different format after all.
- More bug fixes and stabilization of diagnostic functions. This seems to be finally something that works wherever we test it.
- Serialize kmem_cache_create and
