Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-09 Thread Christoph Lameter
On Wed, 9 Sep 2015, Jesper Dangaard Brouer wrote:
> > Hmmm... Guess we need to come up with distinct version of kmalloc() for
> > irq and non irq contexts to take advantage of that. Most at non irq
> > context anyways.
>
> I agree, it would be an easy win. Do notice this will have the most
> imp

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-09 Thread Jesper Dangaard Brouer
On Tue, 8 Sep 2015 12:32:40 -0500 (CDT) Christoph Lameter wrote:
> On Sat, 5 Sep 2015, Jesper Dangaard Brouer wrote:
>
> > The double_cmpxchg without lock prefix still cost 9 cycles, which is
> > very fast but still a cost (add approx 19 cycles for a lock prefix).
> >
> > It is slower than local

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-08 Thread Christoph Lameter
On Sat, 5 Sep 2015, Jesper Dangaard Brouer wrote:
> The double_cmpxchg without lock prefix still cost 9 cycles, which is
> very fast but still a cost (add approx 19 cycles for a lock prefix).
>
> It is slower than local_irq_disable + local_irq_enable that only cost
> 7 cycles, which the bulking ca

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-07 Thread Alexander Duyck
On 09/07/2015 01:16 AM, Jesper Dangaard Brouer wrote:
> On Fri, 4 Sep 2015 11:09:21 -0700 Alexander Duyck wrote:
> > This is an interesting start. However I feel like it might work better if you were to create a per-cpu pool for skbs that could be freed and allocated in NAPI context. So for exampl

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-07 Thread Jesper Dangaard Brouer
On Fri, 4 Sep 2015 11:09:21 -0700 Alexander Duyck wrote:
> This is an interesting start. However I feel like it might work better
> if you were to create a per-cpu pool for skbs that could be freed and
> allocated in NAPI context. So for example we already have
> napi_alloc_skb, why not just

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-05 Thread Jesper Dangaard Brouer
On Fri, 4 Sep 2015 18:45:13 -0500 (CDT) Christoph Lameter wrote:
> On Fri, 4 Sep 2015, Alexander Duyck wrote:
> > Right, but one of the reasons for Jesper to implement the bulk alloc/free is
> > to avoid the cmpxchg that is being used to get stuff into or off of the per
> > cpu lists.
>
> There

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-04 Thread Christoph Lameter
On Fri, 4 Sep 2015, Alexander Duyck wrote:
> Right, but one of the reasons for Jesper to implement the bulk alloc/free is
> to avoid the cmpxchg that is being used to get stuff into or off of the per
> cpu lists.

There is no full cmpxchg used for the per cpu lists. It's a cmpxchg without lock seman

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-04 Thread Alexander Duyck
On 09/04/2015 11:55 AM, Christoph Lameter wrote:
> On Fri, 4 Sep 2015, Alexander Duyck wrote:
> > were to create a per-cpu pool for skbs that could be freed and allocated in NAPI context. So for example we already have napi_alloc_skb, why not just add a napi_free_skb and then make the array of objec

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-04 Thread Christoph Lameter
On Fri, 4 Sep 2015, Alexander Duyck wrote:
> were to create a per-cpu pool for skbs that could be freed and allocated in
> NAPI context. So for example we already have napi_alloc_skb, why not just add
> a napi_free_skb and then make the array of objects to be freed part of a pool
> that could be

Re: [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-04 Thread Alexander Duyck
On 09/04/2015 10:00 AM, Jesper Dangaard Brouer wrote:
> During TX DMA completion cleanup there exists an opportunity in the NIC drivers to perform bulk free, without introducing additional latency. For an IPv4 forwarding workload the network stack is hitting the slowpath of the kmem_cache "slub" al

[RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.

2015-09-04 Thread Jesper Dangaard Brouer
During TX DMA completion cleanup there exists an opportunity in the NIC drivers to perform bulk free, without introducing additional latency. For an IPv4 forwarding workload the network stack is hitting the slowpath of the kmem_cache "slub" allocator. This slowpath can be mitigated by bulk free vi