On Fri, Sep 16, 2011 at 07:50:21AM -0500, Claudio Jeker wrote:
> > I find that during start-up, the CPU of the "route decision engine"
> > process is steady between 90-100%.  During this time, "bgpctl" hangs.
> > This lasts at least 45 minutes.
> > 
> > I believe most of the CPU is spent in "path_lookup()", traversing the
> > linked list in pathtable.path_hashtbl[].  I think a suitable fix would be
> > to increase the hash table sizes (rde.c:152):
> 
> Is this just a "believe" or is there some profiling output behind this?
> I know that when many sessions open at the same time it will lock the RDE
> for a long time. It is something that I should finally invest some time in
> solving.

I've taken several core dumps while the problem is happening, and the
stack trace is the same.  I am having trouble getting RDE to dump
profiling data, but I'm still working on that.

> > This change should probably be coupled with a better hash calculation.
> 
> Again, is this just a gut feeling or is there statistical data behind this
> statement that shows that the current hash is performing badly...

Intuition.  The hash calculation only takes the AS_PATH into account.  In
my case, there are many routes with the same AS_PATH.  I have at least 40
copies of every route.  I will instrument bgpd to confirm.
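
To illustrate the concern, here is a toy sketch (not bgpd code). It assumes
a chained hash table keyed on the AS_PATH alone and counts the longest chain
when each path arrives once per peer. The table size, the FNV-1a hash, the
path and peer counts, and the names fnv1a() and longest_chain() are all
made up for the example; folding a peer id into the key is just one
hypothetical way to separate the copies, not a proposal for what
path_lookup() should hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 1024u	/* toy table size (power of two), not bgpd's value */
#define NPATHS   100u	/* distinct AS_PATHs in the toy data set */
#define NPEERS   40u	/* copies of each path, one per peer */

/* FNV-1a over a byte buffer; a stand-in, not the hash bgpd uses. */
static uint32_t
fnv1a(const void *buf, size_t len, uint32_t h)
{
	const unsigned char *p = buf;

	while (len-- > 0) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/*
 * Count NPATHS * NPEERS entries into their buckets and return the
 * length of the longest chain.  With mix_peer == 0 the key is the
 * AS_PATH alone; otherwise the peer id is folded into the hash too.
 */
static unsigned
longest_chain(int mix_peer)
{
	static unsigned chain[NBUCKETS];
	unsigned longest = 0;
	uint32_t path, peer, h;
	size_t i;

	for (i = 0; i < NBUCKETS; i++)
		chain[i] = 0;

	for (path = 0; path < NPATHS; path++) {
		for (peer = 0; peer < NPEERS; peer++) {
			h = fnv1a(&path, sizeof(path), 2166136261u);
			if (mix_peer)
				h = fnv1a(&peer, sizeof(peer), h);
			h &= NBUCKETS - 1;
			assert(h < NBUCKETS);
			if (++chain[h] > longest)
				longest = chain[h];
		}
	}
	return longest;
}
```

With the AS_PATH-only key, all NPEERS copies of a path land in the same
bucket, so longest_chain(0) can never be smaller than NPEERS no matter how
large the table is.  That is why I expect a bigger table to help only up to
a point unless the key also changes.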

> > Finally, I was surprised to see double the prefix entry count.  I carry
> > 21M routes (add up the last column of "bgpctl show").  Yet, the output
> > above shows 42M "prefix entries".  I do not modify the prefixes at all; my
> > rule set contains only "deny to any; allow from any".
> 
> Yes that seems strange. For a quick fix try setting "softreconfig in no".
> Also run "bgpd -nv" to see if some other setting may cause altering of
> prefixes.

Thanks for the tip.

-- 
 kevin brintnall
 Network Engineer, CenturyLink
