Ondrej Zajicek santi...@crfreenet.org wrote on 2010/04/21 20:15:07:
On Wed, Apr 21, 2010 at 09:41:47AM +0200, Joakim Tjernlund wrote:
I am using Quagga ATM but I had a quick look at BIRD and I got a few
observations.
Hello. Thank you for your tips and notes.
The LSA/checksum
Joakim Tjernlund/Transmode wrote on 2010/04/21 22:04:06:
Ondrej Zajicek santi...@crfreenet.org wrote on 2010/04/21 20:15:07:
On Wed, Apr 21, 2010 at 09:41:47AM +0200, Joakim Tjernlund wrote:
I am using Quagga ATM but I had a quick look at BIRD and I got a few
observations.
Hello.
This is a much simpler and efficent impl. of the IP checksum.
It is a dry port from Quagga and I have not tested it.
---
lib/checksum.c | 44 +++-
1 files changed, 11 insertions(+), 33 deletions(-)
diff --git a/lib/checksum.c b/lib/checksum.c
index
On Fri, Apr 23, 2010 at 08:27:40AM +0200, Joakim Tjernlund wrote:
Yes, but not at the moment. The endian problem should be addressed when you
build the lsa.
Does this help at all? In any case, the (int) cast should be there.
Removing of endianity swap is correct only if the Fletcher
Ondrej Zajicek santi...@crfreenet.org wrote on 2010/04/23 11:52:26:
From: Ondrej Zajicek santi...@crfreenet.org
To: Joakim Tjernlund joakim.tjernl...@transmode.se
Cc: bird-us...@trubka.network.cz
Date: 2010/04/23 11:46
Subject: Re: OSPF performance/SPF calculations
On Fri, Apr 23, 2010
On Fri, Apr 23, 2010 at 10:13:32AM +0200, Wolfgang Hennerbichler wrote:
now I've setup BIRD to peer on the different source interfaces and from
different ASes to simulate productive routers:
protocol bgp R1 {
debug all;
local as 1120;
neighbor 193.203.0.3 as 1267;
import
On Fri, Apr 23, 2010 at 01:06:20PM +0200, Joakim Tjernlund wrote:
I must be missing something then(not surprising as I just started looking
at BIRD). Why do you need the separate allocation for the body of the LSA
then?
Why not just adding entries to the allocated LSA header?
Ahh, I am
Hello!
This is a much simpler and efficent impl. of the IP checksum.
It is a dry port from Quagga and I have not tested it.
Are you convinced that your version is more efficient? The original version
processes 32 bits at a time, while your code does only 16 bits at a time. It
might
be worth
Ondrej Zajicek santi...@crfreenet.org wrote on 2010/04/23 14:22:18:
On Fri, Apr 23, 2010 at 01:06:20PM +0200, Joakim Tjernlund wrote:
I must be missing something then(not surprising as I just started looking
at BIRD). Why do you need the separate allocation for the body of the LSA
Martin Mares m...@ucw.cz wrote on 2010/04/23 15:03:34:
Hello!
This is a much simpler and efficent impl. of the IP checksum.
It is a dry port from Quagga and I have not tested it.
Are you convinced that your version is more efficient? The original version
processes 32 bits at a time,
Hello!
Fairly, I once had the same idea for Quagga but found all those extra tests
and
additions were much slower(I benched it). Just look at the number of extra
ops one
has to do in the current code.
If you want to do it faster you have to go ASM. It would be easy to
add support for
Hello!
But it slows down the fletcher checksum as one need to test and extra
calculations because of this. Any gain by the separation is lost many times
over
in the fletcher checksum which could be as simple as(from Quagga with my
tweaks):
Again, do you have any real numbers backing this
Hello!
But you can't get rid of:
z + (z sum)
which is the real bottleneck. Perhaps this doesn't cost much
on high end CPUs but it sure does on embedded CPUs
Why should it be? It can be compiled as a sequence of add with carry
instructions, can't it?
Have
Hello!
But you can't get rid of:
z + (z sum)
which is the real bottleneck. Perhaps this doesn't cost much
on high end CPUs but it sure does on embedded CPUs
Why should it be? It can be compiled as a sequence of add with carry
instructions, can't it?
Yes, but have you seen gcc do
Ondrej Zajicek santi...@crfreenet.org wrote on 2010/04/23 16:09:00:
On Fri, Apr 23, 2010 at 10:54:33AM +0200, Joakim Tjernlund wrote:
Removing of endianity swap is correct only if the Fletcher checksum
would return the same value regardless of endianity swap. Is this
a property of the
Joakim Tjernlund/Transmode wrote on 2010/04/23 16:14:58:
Hello!
But you can't get rid of:
z + (z sum)
which is the real bottleneck. Perhaps this doesn't cost much
on high end CPUs but it sure does on embedded CPUs
Why should it be? It can be compiled as a sequence of add
On Fri, Apr 23, 2010 at 04:12:27PM +0200, Joakim Tjernlund wrote:
As i looked on the Fletcher checksum, it seems that you cannot just swap
the result instead of swapping the checked data.
Then there is a bug else where. Fletcher as such does not
depend on host endian. It operates on bytes
Hello!
Just tried this and it didn't with gcc 3.4.3 on PowerPC
It would be better to let gcc unroll the loop (if it is critical for
performance, it should be unrolled anyway) and use a newer version of gcc.
Some arch does not have an add with carry insn(MIPS?)
Well, first of all, we should
On Fri, Apr 23, 2010 at 04:12:27PM +0200, Joakim Tjernlund wrote:
As i looked on the Fletcher checksum, it seems that you cannot just swap
the result instead of swapping the checked data.
Then there is a bug else where. Fletcher as such does not
depend on host endian. It operates on
On Fri, Apr 23, 2010 at 05:29:09PM +0200, Joakim Tjernlund wrote:
But you also need LSAs in host endianity when doing SPF calculation.
Although it would be probably possible to change SPF calculation to
use directly BE values it would be huge work and it is questionable
whether it wouldn't
On 23.4.2010 18:32, Ondrej Zajicek wrote:
On Fri, Apr 23, 2010 at 03:20:55PM +0200, Martin Mares wrote:
Hello!
Fairly, I once had the same idea for Quagga but found all those extra tests
and
additions were much slower(I benched it). Just look at the number of extra
ops one
has to do in
Ondrej Filip fe...@network.cz wrote on 2010/04/23 18:41:57:
On 23.4.2010 18:32, Ondrej Zajicek wrote:
On Fri, Apr 23, 2010 at 03:20:55PM +0200, Martin Mares wrote:
Hello!
Fairly, I once had the same idea for Quagga but found all those extra
tests and
additions were much slower(I
On 23.4.2010 14:22, Ondrej Zajicek wrote:
On Fri, Apr 23, 2010 at 01:06:20PM +0200, Joakim Tjernlund wrote:
I must be missing something then(not surprising as I just started looking
at BIRD). Why do you need the separate allocation for the body of the LSA
then?
Why not just adding entries to
Ondrej Zajicek santi...@crfreenet.org wrote on 2010/04/23 18:46:19:
On Fri, Apr 23, 2010 at 05:29:09PM +0200, Joakim Tjernlund wrote:
But you also need LSAs in host endianity when doing SPF calculation.
Although it would be probably possible to change SPF calculation to
use directly BE
Hello!
So there isn't really difference in performance of both
implementations. Even on slow embedded AMD Geode CPU, it gives
~ 180 MB/s.
No difference? what does 1.2 mean? to me this means 20% which is a lot
Yes, but according to Santiago's benchmarks, your code is sometimes 20%
On 23.4.2010 19:01, Joakim Tjernlund wrote:
Ondrej Filip fe...@network.cz wrote on 2010/04/23 18:41:57:
On 23.4.2010 18:32, Ondrej Zajicek wrote:
On Fri, Apr 23, 2010 at 03:20:55PM +0200, Martin Mares wrote:
Hello!
My primary reaction was If something isn't broken, don't fix it. I.e.,
Martin Mares m...@ucw.cz wrote on 2010/04/23 19:23:18:
Hello!
So there isn't really difference in performance of both
implementations. Even on slow embedded AMD Geode CPU, it gives
~ 180 MB/s.
No difference? what does 1.2 mean? to me this means 20% which is a lot
Yes, but
Hello!
Peformance matter, especially when the network grows. Is this the way
BIRD development works? Only work on stuff that currently feels important
is acceptet?
Yes, performance matters. This is why performance optimizations in BIRD
have to be justified by real numbers, not by hand-waving.
Ondrej Filip fe...@network.cz wrote on 2010/04/23 19:28:27:
On 23.4.2010 19:01, Joakim Tjernlund wrote:
Ondrej Filip fe...@network.cz wrote on 2010/04/23 18:41:57:
On 23.4.2010 18:32, Ondrej Zajicek wrote:
On Fri, Apr 23, 2010 at 03:20:55PM +0200, Martin Mares wrote:
Hello!
My
Again, BIRD is used on some very important places and therefore we are
very conservative in accepting new patches. But I don't think we have
NIH syndrome. We have been accepting foreign patches since beginning
and Ondrej Zajicek is a good example. :-) He had sent me some new
patches and later
Hello!
So basically you are saying that outsiders like my self aren't welcome
because BIRD is so important to some IXPs that you don't want to
take any chanches?
Certainly not. However, it means that the criteria for accepting patches
are somewhat stricter than in many other projects.
All
On Fri, Apr 23, 2010 at 07:40:28PM +0200, Joakim Tjernlund wrote:
So there isn't really difference in performance of both
implementations. Even on slow embedded AMD Geode CPU, it gives
~ 180 MB/s.
No difference? what does 1.2 mean? to me this means 20% which is a lot
Yes,
On Fri, Apr 23, 2010 at 07:40:28PM +0200, Joakim Tjernlund wrote:
Martin Mares m...@ucw.cz wrote on 2010/04/23 19:23:18:
Hello!
So there isn't really difference in performance of both
implementations. Even on slow embedded AMD Geode CPU, it gives
~ 180 MB/s.
No
Ondrej Zajicek santi...@crfreenet.org wrote on 2010/04/23 21:39:06:
On Fri, Apr 23, 2010 at 07:40:28PM +0200, Joakim Tjernlund wrote:
Martin Mares m...@ucw.cz wrote on 2010/04/23 19:23:18:
Hello!
So there isn't really difference in performance of both
implementations. Even
Martin Mares m...@ucw.cz wrote on 2010/04/23 23:02:29:
Hello!
How did you come to the conclusion the the current code was better than
the previous version? Seems like hand waving to me.
Did I claim anywhere that the old code is better? I only pointed out
You did when you commited it
35 matches
Mail list logo