Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-27 Thread Christoph Lameter
On Mon, 27 Oct 2014, Joonsoo Kim wrote:
> > One other aspect of this patchset is that it reduces the cache footprint
> > of the alloc and free functions. This typically results in a performance
> > increase for the allocator. If we can avoid the page_address() and
> > virt_to_head_page() stuff

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-27 Thread Joonsoo Kim
On Fri, Oct 24, 2014 at 09:41:49AM -0500, Christoph Lameter wrote:
> > I found that you said retrieving tid first is sufficient to do
> > things right in old discussion. :)
>
> Right but the tid can be obtained from a different processor.
>
>
> One other aspect of this patchset is that it

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-27 Thread Joonsoo Kim
On Fri, Oct 24, 2014 at 09:02:18AM -0500, Christoph Lameter wrote:
> On Fri, 24 Oct 2014, Joonsoo Kim wrote:
>
> > In this case, object from cpu1's cpu_cache should be
> > different with cpu0's, so allocation would be failed.
>
> That is true for most object pointers unless the value is NULL.

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-27 Thread Joonsoo Kim
On Fri, Oct 24, 2014 at 09:02:18AM -0500, Christoph Lameter wrote: On Fri, 24 Oct 2014, Joonsoo Kim wrote: In this case, object from cpu1's cpu_cache should be different with cpu0's, so allocation would be failed. That is true for most object pointers unless the value is NULL. Which it

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-27 Thread Joonsoo Kim
On Fri, Oct 24, 2014 at 09:41:49AM -0500, Christoph Lameter wrote: I found that you said retrieving tid first is sufficient to do things right in old discussion. :) Right but the tid can be obtained from a different processor. One other aspect of this patchset is that it reduces the

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-27 Thread Christoph Lameter
On Mon, 27 Oct 2014, Joonsoo Kim wrote: One other aspect of this patchset is that it reduces the cache footprint of the alloc and free functions. This typically results in a performance increase for the allocator. If we can avoid the page_address() and virt_to_head_page() stuff that is

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-24 Thread Christoph Lameter
> I found that you said retrieving tid first is sufficient to do
> things right in old discussion. :)

Right but the tid can be obtained from a different processor.

One other aspect of this patchset is that it reduces the cache footprint of the alloc and free functions. This typically results in

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-24 Thread Christoph Lameter
On Fri, 24 Oct 2014, Joonsoo Kim wrote:
> In this case, object from cpu1's cpu_cache should be
> different with cpu0's, so allocation would be failed.

That is true for most object pointers unless the value is NULL. Which it can be. But if this is the only case then the second patch + your

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-24 Thread Christoph Lameter
On Fri, 24 Oct 2014, Joonsoo Kim wrote: In this case, object from cpu1's cpu_cache should be different with cpu0's, so allocation would be failed. That is true for most object pointers unless the value is NULL. Which it can be. But if this is the only case then the second patch + your approach

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-24 Thread Christoph Lameter
I found that you said retrieving tid first is sufficient to do things right in old discussion. :) Right but the tid can be obtained from a different processor. One other aspect of this patchset is that it reduces the cache footprint of the alloc and free functions. This typically results in a

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-23 Thread Joonsoo Kim
On Thu, Oct 23, 2014 at 09:18:29AM -0500, Christoph Lameter wrote:
> On Thu, 23 Oct 2014, Joonsoo Kim wrote:
>
> > Preemption disable during very short code would cause large problem for RT?
>
> This is the hotpath and preempt enable/disable adds a significant number
> of cycles.
>
> > And, if

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-23 Thread Christoph Lameter
On Thu, 23 Oct 2014, Joonsoo Kim wrote:
> Preemption disable during very short code would cause large problem for RT?

This is the hotpath and preempt enable/disable adds a significant number of cycles.

> And, if page_address() and virt_to_head_page() remain as current patchset
> implementation,

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-23 Thread Joonsoo Kim
On Wed, Oct 22, 2014 at 10:55:17AM -0500, Christoph Lameter wrote:
> We had to insert a preempt enable/disable in the fastpath a while ago. This
> was mainly due to a lot of state that is kept to be allocating from the per
> cpu freelist. In particular the page field is not covered by

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-23 Thread Joonsoo Kim
On Wed, Oct 22, 2014 at 10:55:17AM -0500, Christoph Lameter wrote: We had to insert a preempt enable/disable in the fastpath a while ago. This was mainly due to a lot of state that is kept to be allocating from the per cpu freelist. In particular the page field is not covered by

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-23 Thread Christoph Lameter
On Thu, 23 Oct 2014, Joonsoo Kim wrote: Preemption disable during very short code would cause large problem for RT? This is the hotpath and preempt enable/disable adds a significant number of cycles. And, if page_address() and virt_to_head_page() remain as current patchset implementation,

Re: [RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-23 Thread Joonsoo Kim
On Thu, Oct 23, 2014 at 09:18:29AM -0500, Christoph Lameter wrote: On Thu, 23 Oct 2014, Joonsoo Kim wrote: Preemption disable during very short code would cause large problem for RT? This is the hotpath and preempt enable/disable adds a significant number of cycles. And, if

[RFC 0/4] [RFC] slub: Fastpath optimization (especially for RT)

2014-10-22 Thread Christoph Lameter
We had to insert a preempt enable/disable in the fastpath a while ago. This was mainly due to a lot of state that is kept to be allocating from the per cpu freelist. In particular the page field is not covered by this_cpu_cmpxchg used in the fastpath to do the necessary atomic state change for
