On Fri, Sep 16, 2011 at 07:50:21AM -0500, Claudio Jeker wrote:
> > I find that during start-up, the CPU of the "route decision engine"
> > process is steady between 90-100%. During this time, "bgpctl" hangs.
> > This lasts at least 45 minutes.
> >
> > I believe most of the CPU is spent in "path_lookup()", traversing the
> > linked list in pathtable.path_hashtbl[]. I think a suitable fix would be
> > to increase the hash table sizes (rde.c:152):
>
> Is this just a "believe" or is there some profiling output behind this?
> I know that when many sessions open at the same time it will lock the RDE
> for a long time. It is something that I should finally invest some time in
> to solve.

I've taken several core dumps while the problem is happening, and the
stack trace is the same in each one. I am having trouble getting the RDE
to dump profiling data, but I'm still working on that.

> > This change should probably be coupled with a better hash calculation.
>
> Again, is this just a gut feeling or is there statistical data behind this
> statement that shows that the current hash is performing badly...

Intuition. The hash calculation only takes the AS_PATH into account. In
my case, there are many routes with the same AS_PATH: I have at least 40
copies of every route. I will instrument bgpd to confirm.

> > Finally, I was surprised to see double the prefix entry count. I carry
> > 21M routes (add up the last column of "bgpctl show"). Yet, the output
> > above shows 42M "prefix entries". I do not modify the prefixes at all; my
> > rule set contains only "deny to any; allow from any".
>
> Yes that seems strange. For a quick fix try setting "softreconfig in no".
> Also run "bgpd -nv" to see if some other setting may cause altering of
> prefixes.

Thanks for the tip.

-- 
kevin brintnall
Network Engineer, CenturyLink

