(bulk:2048)
90 cycles(tsc) 22.585 ns (bulk:4096)
[1]
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test01.c
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
mm/slab.c | 87 +++--
On Mon, 7 Sep 2015 13:22:13 -0700
Linus Torvalds <torva...@linux-foundation.org> wrote:
> On Mon, Sep 7, 2015 at 2:30 AM, Jesper Dangaard Brouer
> <bro...@redhat.com> wrote:
> >
> > The slub allocator has a faster "fastpath", if your workload is
> >
an_tx_irq() 373 ns.
At 10Gbit/s, how many bytes can arrive in this period? Only 466 bytes
((373/10^9)*(10*10^9)/8).
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/br
On Tue, 8 Sep 2015 12:32:40 -0500 (CDT)
Christoph Lameter <c...@linux.com> wrote:
> On Sat, 5 Sep 2015, Jesper Dangaard Brouer wrote:
>
> > The double_cmpxchg without lock prefix still costs 9 cycles, which is
> > very fast but still a cost (add approx 19
the API of always returning the exact number of requested
objects will not work...
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
(related to http://t
On Wed, 16 Sep 2015 10:13:25 -0500 (CDT)
Christoph Lameter <c...@linux.com> wrote:
> On Wed, 16 Sep 2015, Jesper Dangaard Brouer wrote:
>
> >
> > Hint, this leads up to discussing if current bulk *ALLOC* API need to
> > be changed...
> >
> > Alex and
On Mon, 28 Sep 2015 11:30:00 -0500 (CDT) Christoph Lameter <c...@linux.com>
wrote:
> On Mon, 28 Sep 2015, Jesper Dangaard Brouer wrote:
>
> > Not knowing SLUB as well as you, it took me several hours to realize
> > init_object() didn't overwrite the freepointer
On Mon, 28 Sep 2015 11:28:15 -0500 (CDT)
Christoph Lameter <c...@linux.com> wrote:
> On Mon, 28 Sep 2015, Jesper Dangaard Brouer wrote:
>
> > > Do you really need separate parameters for freelist_head? If you just want
> > > to deal with one object pass it as
The #ifdef of CONFIG_SLUB_DEBUG is located very far from
the associated #else. For readability mark it with a comment.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Acked-by: Christoph Lameter <c...@linux.com>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
soon as I've cleaned it up, rebased it
on net-next and re-run all the benchmarks.
---
Christoph Lameter (2):
slub: create new ___slab_alloc function that can be called with irqs
disabled
slub: Avoid irqoff/on in bulk allocation
Jesper Dangaard Brouer (4):
slub: mark the da
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test01.c
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
Acked-by: Christoph Lameter <c.
On Tue, 29 Sep 2015 09:38:30 -0700
Alexander Duyck <alexander.du...@gmail.com> wrote:
> On 09/29/2015 08:48 AM, Jesper Dangaard Brouer wrote:
> > Make it possible to free a freelist with several objects by adjusting
> > API of slab_free() and __slab_free() to have head
://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test01.c
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Acked-by: Christoph Lameter <c...@linux.com>
---
mm/slab.c | 87 +++--
1 file changed, 6
no performance reduction due to this change,
when debugging is turned off (compiled with CONFIG_SLUB_DEBUG).
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
---
V4:
- Change API per req of Christoph Lameter
- Rem
ter <c...@linux.com>
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
mm/slub.c | 24 +++-
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 02cfb3a5983e..024eed32da2c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@
which promptly disables them again using the expensive
local_irq_save().
Signed-off-by: Christoph Lameter <c...@linux.com>
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
mm/slub.c | 44 +---
1 file changed, 29 insertions(+), 15
no performance reduction due to this change,
when debugging is turned off (compiled with CONFIG_SLUB_DEBUG).
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
---
V4:
- Change API per req of Christoph Lameter
- Rem
On Mon, 28 Sep 2015 10:16:49 -0500 (CDT)
Christoph Lameter <c...@linux.com> wrote:
> On Mon, 28 Sep 2015, Jesper Dangaard Brouer wrote:
>
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 1cf98d89546d..13b5f53e4840 100644
> > --- a/mm/slub.c
> > +++ b/mm/slu
On Mon, 28 Sep 2015 07:53:16 -0700 Alexander Duyck <alexander.du...@gmail.com>
wrote:
> On 09/28/2015 05:26 AM, Jesper Dangaard Brouer wrote:
> > For practical use-cases it is beneficial to prefetch the next freelist
> > object in bulk allocation loop.
> >
> >
On Fri, 2 Oct 2015 11:41:18 +0200
Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> On Thu, 1 Oct 2015 15:10:15 -0700
> Andrew Morton <a...@linux-foundation.org> wrote:
>
> > On Wed, 30 Sep 2015 13:44:19 +0200 Jesper Dangaard Brouer
> > <bro...@redhat.com
On Fri, 2 Oct 2015 05:10:02 -0500 (CDT)
Christoph Lameter <c...@linux.com> wrote:
> On Fri, 2 Oct 2015, Jesper Dangaard Brouer wrote:
>
> > Thus, I need introducing new code like this patch and at the same time
> > have to reduce the number of instruction-cache misses/usa
On Thu, 1 Oct 2015 15:10:15 -0700
Andrew Morton <a...@linux-foundation.org> wrote:
> On Wed, 30 Sep 2015 13:44:19 +0200 Jesper Dangaard Brouer <bro...@redhat.com>
> wrote:
>
> > Make it possible to free a freelist with several objects by adjusting
> > A
in git
tree "net" by commit 31b33dfb0a14 ("skbuff: Fix skb checksum partial
check.").
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
kmem_cache_alloc_bulk()
mm/slab.c | 87 ++-
mm/slub.c | 276 +
2 files changed, 267 insertions(+), 96 deletions(-)
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Autho
cycles(tsc) - 27 cycles(tsc) - increase in cycles:0
158 - 30 cycles(tsc) - 30 cycles(tsc) - increase in cycles:0
250 - 37 cycles(tsc) - 37 cycles(tsc) - increase in cycles:0
Note, benchmark done with slab_nomerge to keep it stable enough
for accurate comparison.
Signed-off-by: Jesper Dangaard
The #ifdef of CONFIG_SLUB_DEBUG is located very far from
the associated #else. For readability mark it with a comment.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
mm/slub.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
Dangaard Brouer <bro...@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
---
mm/slub.c | 97 +
1 file changed, 84 insertions(+), 13 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test01.c
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
---
mm/slub.c | 109
Dangaard Brouer <bro...@redhat.com>
---
net/ethernet/eth.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index d850fdc828f9..9e63f252a89e 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -127,7 +127,7 @@ u32 eth_get_h
On Tue, 29 Sep 2015 10:20:20 -0700
Alexander Duyck <alexander.du...@gmail.com> wrote:
> On 09/29/2015 10:00 AM, Jesper Dangaard Brouer wrote:
> > On Tue, 29 Sep 2015 09:38:30 -0700
> > Alexander Duyck <alexander.du...@gmail.com> wrote:
> >
> >> On 09/29/2
mpling happens. So you may
> have large skid and the sampling points may be far away. Skylake has new
> special FRONTEND_* PEBS events for this, but before it was often difficult.
This testlab CPU is i7-4790K @ 4.00GHz. Maybe I should get a Skylake...
p.s. thanks for your pmu-tools[1], eve
nt_size(hopt->rate.mpu, b2));
> + fprintf(f, "cburst %s/%u mpu %s ",
> sprint_size(cbuffer, b1),
> 1<<hopt->ceil.cell_log,
> - sprint_size(hopt->ceil.mpu&0xFF, b2)
is point the
> skb has already been popped off the qdisc so it has to be handled
> by the infrastructure.
I generally like this idea of resolving this per cpu. (I stalled here,
on the requeue issue, last time I implemented a lockless qdisc
approach).
--
Best regards,
Jesper Dangaard Broue
On Fri, 18 Dec 2015 16:16:39 +0300
Dmitrii Shcherbakov <fw.dmit...@yandex.com> wrote:
> b3 buffer has been deleted previously so b2 is followed by b4 which is not
> consistent
>
> Signed-off-by: Dmitrii Shcherbakov <fw.dmit...@yandex.com>
> ---
Acked-by:
overhead' field in the ratespec structure has been
> introduced.
>
> Signed-off-by: Dmitrii Shcherbakov <fw.dmit...@yandex.com>
> ---
Acked-by: Jesper Dangaard Brouer <bro...@redhat.com>
Thank you Dmitrii for cleaning this up :-)
--
Best regards,
Jesper Dangaard Brouer
MSc.C
))
__dev_kfree_skb_irq(skb, reason);
else
dev_kfree_skb(skb);
}
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project
--
Best reg
> +
> +static inline int skb_array_peek_len(struct skb_array *a)
> +{
> + return PTR_RING_PEEK_CALL(&a->ring, __skb_array_len_with_tag);
> +}
> +
[...]
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
array_XXX APIs with a spinlock,
> so this should not be an issue for them.
I would like to see some bulking support...
My experiments[1] show that alf_queue (primarily) can beat skb_array due
to its bulking support. It seems like an obvious optimization for the virt
tun use-case to bulk dequeue SK
spin_lock_bh()
> + ptr = __ptr_ring_consume(r);
> + spin_unlock(&r->consumer_lock);
and spin_unlock_bh()
> +
> + return ptr;
> +}
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
On Thu, 2 Jun 2016 20:47:25 +0200
Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> On Tue, 24 May 2016 23:34:14 +0300
> "Michael S. Tsirkin" <m...@redhat.com> wrote:
>
> > On Tue, May 24, 2016 at 07:03:20PM +0200, Jesper Dangaard Brouer wrote:
> >
ppers around ptr_array.
^
It is called "ptr_ring" not "ptr_array".
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
> destroy now scans the array and frees all queued skbs
>
> changes since v5
> implemented a generic ptr_ring api, and
> made skb_array a type-safe wrapper
> apis for taking the spinlock in different contexts
> following expected usecase
On Mon, 13 Jun 2016 23:54:50 +0300
"Michael S. Tsirkin" <m...@redhat.com> wrote:
> Update skb_array after ptr_ring API changes.
>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Jesper Dangaard Brouer <bro...@redhat.com>
Tested-by: Jesp
ains destructor callback such that
> all pointers in queue can be cleaned up.
>
> This changes some APIs but we don't have any users yet,
> so it won't break bisect.
>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Acked-by: Jesper Dangaard Brouer <bro...@redhat
On Mon, 13 Jun 2016 23:54:31 +0300
"Michael S. Tsirkin" <m...@redhat.com> wrote:
> A simple array based FIFO of pointers. Intended for net stack which
> commonly has a single consumer/producer.
>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
Ack
kin <m...@redhat.com>
Acked-by: Jesper Dangaard Brouer <bro...@redhat.com>
Tested-by: Jesper Dangaard Brouer <bro...@redhat.com>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
_list = NULL;
> +
> mutex_unlock(_mutex);
> +
> + while (head) {
> + struct sk_buff *next = head->next;
> +
> + kfree_skb(head);
> + cond_resched();
> + head = next;
> + }
> }
This looks a lot like kfree_skb_list()
What about bulk free'ing SKBs here?
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
On Tue, 24 May 2016 23:34:14 +0300
"Michael S. Tsirkin" <m...@redhat.com> wrote:
> On Tue, May 24, 2016 at 07:03:20PM +0200, Jesper Dangaard Brouer wrote:
> >
> > On Tue, 24 May 2016 12:28:09 +0200
> > Jesper Dangaard Brouer <bro...@redhat.com> wrot
pps
>
> Now we should work to add batches on the enqueue() side ;)
Yes, please! :-))) That will be the next big step!
> Signed-off-by: Eric Dumazet <eduma...@google.com>
> Cc: John Fastabend <john.r.fastab...@intel.com>
> Cc: Jesper Dangaard Brouer <bro...@redhat.co
uct sk_buff *skb,
> struct Qdisc *q,
> }
> }
> spin_unlock(root_lock);
> + if (unlikely(to_free))
> + kfree_skb_list(to_free);
Great, now there is a good argument for implementing kmem_cache bulk
freeing inside kfree_skb_list(). I did an ugly
series brings a nice qdisc performance increase (more than 80 %
> in some cases).
Thanks for working on this Eric! this is great work! :-)
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
On Wed, 22 Jun 2016 07:55:43 -0700
Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Wed, 2016-06-22 at 16:47 +0200, Jesper Dangaard Brouer wrote:
> > On Tue, 21 Jun 2016 23:16:48 -0700
> > Eric Dumazet <eduma...@google.com> wrote:
> >
> > >
On Wed, 22 Jun 2016 09:49:48 -0700
Eric Dumazet <eric.duma...@gmail.com> wrote:
> On Wed, 2016-06-22 at 17:44 +0200, Jesper Dangaard Brouer wrote:
> > On Wed, 22 Jun 2016 07:55:43 -0700
> > Eric Dumazet <eric.duma...@gmail.com> wrote:
> >
> > >
1.1-4))
Joint work with Alexander Duyck.
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --gi
needed. This due to netpoll can call from
IRQ context.
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
include/linux/skbuff.h |1 +
net/core/dev.c |8 +++-
net/core/skbuff.c |8 +
, e.g. replacing their calls to dev_kfree_skb() /
dev_consume_skb_any().
Driver ixgbe is the first user of this new API.
[1] http://thread.gmane.org/gmane.linux.network/384302/focus=397373
---
Jesper Dangaard Brouer (3):
net: bulk free infrastructure for NAPI context, use napi_consume_skb
is to see if budget is 0. In that case, we
need to invoke dev_consume_skb_irq().
Joint work with Alexander Duyck.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.du...@redhat.com>
---
include/linux/skbuff.h |3 ++
ne
On Tue, 9 Feb 2016 13:57:41 +0200
Saeed Mahameed <sae...@dev.mellanox.co.il> wrote:
> On Tue, Feb 2, 2016 at 11:13 PM, Jesper Dangaard Brouer
> <bro...@redhat.com> wrote:
> > There are several techniques/concepts combined in this optimization.
> > It is both a da
pecially it has been difficult to get
people to really adopt "ip", which is also the worst search term on the
Internet today...
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
On Thu, 28 Jan 2016 08:37:07 -0800
Tom Herbert <t...@herbertland.com> wrote:
> On Thu, Jan 28, 2016 at 4:45 AM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> > On Thu, 2016-01-28 at 10:25 +0100, Jesper Dangaard Brouer wrote:
> >
> >> Yes, th
On Wed, 27 Jan 2016 18:50:27 -0800
Tom Herbert <t...@herbertland.com> wrote:
> On Wed, Jan 27, 2016 at 12:47 PM, Jesper Dangaard Brouer
> <bro...@redhat.com> wrote:
> > On Mon, 25 Jan 2016 23:10:16 +0100
> > Jesper Dangaard Brouer <bro...@redhat.com> wrote:
>
On Wed, 27 Jan 2016 13:56:03 -0800
Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote:
> On Wed, Jan 27, 2016 at 09:47:50PM +0100, Jesper Dangaard Brouer wrote:
> > Sum: 18.75 % => calc: 30.0 ns (sum: 30.0 ns) => Total: 159.9 ns
> >
> > To get around the
NEED TO CLEAN UP PATCH (likely still contains bugs...)
When enabling Receive Packet Steering (RPS) like :
echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
for N in $(seq 0 7) ; do
echo 4096 > /sys/class/net/${DEV}/queues/rx-$N/rps_flow_cnt
echo f >
maintains a qlen, which is unnecessary in this hotpath code.
A simple list within the first SKB could be a minimum solution.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 +---
include/linux/netde
Normal TX completion uses napi_consume_skb(), thus also make the dummy driver
use this, as it makes it easier to see the effect of bulk freeing SKBs.
---
drivers/net/dummy.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index
Use the newly introduced napi_alloc_skb_hint() API, to get the underlying
slab bulk allocation sizes to align with what mlx5 driver need for refilling
its RX ring queue.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h
is the mlx5 driver, which bulk re-populates its RX ring
with both SKBs and pages. Thus, it would like to work with
bigger bulk alloc chunks.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
include/linux/skbuff.h | 19 +++
net/core/skbuff.c |8 +++--
packet from the RX
ring and starting the prefetching, and the second loop calling
eth_type_trans() and invoking the stack via napi_gro_receive().
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Notes:
This is the patch that gave a speed up of 6.2Mpps to 12Mpps, when
trying to m
allocation, knowing the size of objects it needs to refill. For
now, just use the default bulk size hidden inside napi_alloc_skb().
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h |4 ++--
drivers/net/ethernet/mellanox/mlx
oF [1]
[1]
http://netdevconf.org/1.1/bof-network-performance-bof-jesper-dangaard-brouer.html
[2] http://thread.gmane.org/gmane.linux.network/384302/
---
Jesper Dangaard Brouer (11):
net: bulk free infrastructure for NAPI context, use napi_consume_skb
net: bulk free SKBs that were del
t (normal hidden by prefetch)
* In case RX queue is not full, alloc and free more SKBs than needed
More testing is needed with more real life benchmarks.
Joint work with Alexander Duyck.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
Signed-off-by: Alexander Duyck <alexander.
On Tue, 2 Feb 2016 16:52:50 -0800
Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote:
> On Tue, Feb 02, 2016 at 10:12:01PM +0100, Jesper Dangaard Brouer wrote:
> > Think twice before applying
> > - This patch can potentially introduce added latency in some workload
port set DEV/PORT_INDEX [ type { eth | ib | auto} ]
>
> butter:~$ dl port show
> devlink0/1: type ib ibdev mlx4_0
> devlink0/2: type ib ibdev mlx4_0
>
> butter:~$ sudo dl port set devlink0/1 type eth
>
> butter:~$ dl port show
> devlink0/1: type eth netdev ens4
> d
On Mon, 25 Jan 2016 23:10:16 +0100
Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> On Mon, 25 Jan 2016 09:50:16 -0800 John Fastabend <john.fastab...@gmail.com>
> wrote:
>
> > On 16-01-25 09:09 AM, Tom Herbert wrote:
> > > On Mon, Jan 25, 2016 at 5:15 A
On Thu, 21 Jan 2016 14:49:25 +0200 Or Gerlitz <gerlitz...@gmail.com> wrote:
> On Thu, Jan 21, 2016 at 1:27 PM, Jesper Dangaard Brouer
> <bro...@redhat.com> wrote:
> > On Wed, 20 Jan 2016 15:27:38 -0800 Tom Herbert <t...@herbertland.com> wrote:
> >
between packets:
* 10 Gbit/s -> 67.2 nanosec
* 40 Gbit/s -> 16.8 nanosec
* 100 Gbit/s -> 6.7 nanosec
Adding such a per packet cost is not going to fly.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analy
st match the device's dev->dev_addr (else a
SW compare is required).
Is that doable in hardware?
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer
On Thu, 21 Jan 2016 00:34:16 +0200
Or Gerlitz <gerlitz...@gmail.com> wrote:
> On Wed, Jan 20, 2016 at 11:13 AM, Jesper Dangaard Brouer
> <bro...@redhat.com> wrote:
> > Hi All,
> >
> > I wrote a small tool[1] to extract ethtool --statistics|-S, sample and
>
"slowpath" case:
SLUB => 117 cycles(tsc) 29.276 ns
SLAB => 101 cycles(tsc) 25.342 ns
I've addressed this "slowpath" problem in the SLUB and SLAB allocators,
by introducing a bulk API, which amortizes the needed sync-mechanisms.
Kmem_cache using bulk API:
SLUB =>
6-01-24 06:44 AM, Michael S. Tsirkin wrote:
> > On Sun, Jan 24, 2016 at 03:28:14PM +0100, Jesper Dangaard Brouer wrote:
> >> On Thu, 21 Jan 2016 10:54:01 -0800 (PST)
> >> David Miller <da...@davemloft.net> wrote:
> >>
> >>> From: Jesper Danga
On Mon, 25 Jan 2016 09:50:16 -0800 John Fastabend <john.fastab...@gmail.com>
wrote:
> On 16-01-25 09:09 AM, Tom Herbert wrote:
> > On Mon, Jan 25, 2016 at 5:15 AM, Jesper Dangaard Brouer
> > <bro...@redhat.com> wrote:
> >>
[...]
> >>
> &g
On Thu, 21 Jan 2016 10:54:01 -0800 (PST)
David Miller <da...@davemloft.net> wrote:
> From: Jesper Dangaard Brouer <bro...@redhat.com>
> Date: Thu, 21 Jan 2016 12:27:30 +0100
>
> > eth_type_trans() does two things:
> >
> > 1) determine skb->protocol
>
On Fri, 22 Jan 2016 09:07:43 -0800
Tom Herbert <t...@herbertland.com> wrote:
> On Fri, Jan 22, 2016 at 4:33 AM, Jesper Dangaard Brouer
> <bro...@redhat.com> wrote:
> > On Thu, 21 Jan 2016 09:48:36 -0800
> > Eric Dumazet <eric.duma...@gmail.com> wrote:
> >
.gmane.org/gmane.linux.network/402503/focus=403386
Patchset based on net-next at commit 3ebeac1d0295
---
Jesper Dangaard Brouer (3):
net: adjust napi_consume_skb to handle none-NAPI callers
mlx4: use napi_consume_skb API to get bulk free operations
mlx5: use napi_consume_skb API t
Bulk free of SKBs happens transparently by the API call napi_consume_skb().
The napi budget parameter is needed by napi_consume_skb() to detect
if called from netpoll.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h
not originating from NAPI/softirq.
Simply handled by using dev_consume_skb_any().
This adds an extra branch+call for the netpoll case (checking
in_irq() + irqs_disabled()), but that is okay as this is a slowpath.
Suggested-by: Alexander Duyck <adu...@mirantis.com>
Signed-off-by: Jesper Dangaard Broue
for the function call that needed
this distinction.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 16 ++--
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
b/d
.gmane.org/gmane.linux.network/402503/focus=403386
Patchset based on net-next at commit 3ebeac1d0295
V3: spelling fixes from Sergei
---
Jesper Dangaard Brouer (3):
net: adjust napi_consume_skb to handle none-NAPI callers
mlx4: use napi_consume_skb API to get bulk free operations
for the function call that needed
this distinction.
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 15 +--
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
b/drive