Hi Claudio,
Note: Any comment below straying from the technical is intended as
/friendly/ banter. Please don't read it any other way!
On Tue, 7 Mar 2006, Claudio Jeker wrote:
> I was notified about this rather long thread by Paul Jakma about
> OpenBGPD and I think I have to clarify a few things.
I figured you'd hear of it, and am very interested to hear your side.
> Paul is talking a lot about memory requirements and uses the numbers
> of a bug report to compare OpenBGPD with quagga.
I did apologise in advance for doing that. Those were the only numbers I
could find with Google. I only used the accounted-for usage figures
(which I presumed did not include leaked RAM, given they were so much
smaller, and reasonably consistent with the total usage reported prior
to soft-reconfig).
Also, the reporter mentioned elsewhere in that thread that their
/normal/ usage, prior to the soft-reconfig integration, was:
"Before the upgrade, I was running at something like 60-80 if I
remember it well."
That's not quite the "25MB" often mentioned in OpenBGPd presentations
for /two/ full feeds. ;) So either:
- the reporter's memory was wrong,
or
- the much-bandied "2 full feeds -> 25MB" figure either:
- doesn't tie in with operational reality,
or
- is derived with feeds which are 'best case'
(E.g.: Both from the same upstream, so the AS_PATHs are exact same)
or
- is simply out of date, e.g. maybe because:
- bigger full feeds these days
- the figure refers to older OpenBGPd and memory needs have changed.
?
FWIW, here's data from a Quagga 0.99-today bgpd (with a couple of small
changes to the definition of a structure, see below) with one full 175k
feed, and another ~25k partial-feed (from a different upstream), 65k
AS_PATH attributes, 100k general attributes total:
VmPeak: 86516 kB
VmSize: 85920 kB
VmRSS: 82084 kB
VmData: 80824 kB
VmStk: 88 kB
VmExe: 648 kB
VmLib: 3676 kB
85MB, which is quite comparable to the "60 to 80MB" the OpenBGPd user
I quoted reported for pre-soft-reconfig, presuming they weren't
mistaken.
The 'zebra' daemon is quite bloated, 61MB. Though, that should stay
constant, as explained in other emails. We'll try to fix it at some
stage (it stores nexthop information per-route, which probably accounts
for 40 to 50% of its RAM usage, and is mostly utterly redundant).
> If I count correctly I get 150MB memory usage and some sessions have
> been up for more than 8 weeks. Yes, this box has soft-reconfig-in
> disabled, but it would not matter anyway because there is no filtering
> done.
Very good indeed, but I still don't think it's /incomparable/ to Quagga.
Out of curiosity, what would that figure look like with soft-reconfig
enabled? (Even an estimate.) Quagga's soft-reconfig overhead is minimal:
no extra RIB table entries, just one 32-byte struct per BGP path. If no
modifications were made to attributes - i.e. filter only - there are no
additional attribute overheads. So about 5MB per 180k full-feed, I
think.
Here are figures for a Quagga bgpd that is roughly similar to your
config, they are from the RIPE NCC "RIS" routing-data collector project
(see http://www.ris.ripe.net):
- older Quagga, 0.96.5, but I believe it should still be representative
- just under 100 peers
- All but a few sessions are several weeks old
- More than 5 sessions are up for *many* months
- not forwarding, hence not running zebra
(explains previous line, given the version)
- 8 full ~180k feeds,
- slightly fewer distinct AS_PATH entries than yours, though: ~256k
Memory usage is:
VmSize: 187252 kB
VmLck: 0 kB
VmRSS: 183072 kB
VmData: 184488 kB
VmStk: 36 kB
VmExe: 596 kB
VmLib: 1972 kB
187MB, versus the 150MB for OpenBGPd: Quagga uses roughly 20-25% more,
depending on which figure you take as the baseline. That's certainly
still comparable at least.
Here's one with:
- order of 150ish peers
- most sessions are up for several weeks
- again, not forwarding, not running zebra
- 13 full ~180k IPv4 feeds
- also 0.96.5
- about 55% more AS_PATH attributes (~400k):
VmSize: 278992 kB
VmLck: 0 kB
VmRSS: 262996 kB
VmData: 276224 kB
VmStk: 40 kB
VmExe: 596 kB
VmLib: 1972 kB
Memory usage is roughly 50% greater, closely in line with the ~55%
difference in AS_PATH attributes.
Now, there's actually some really 'low-hanging fruit': junk that
shouldn't be in our core data structures, and wasted padding due to poor
layout. Fixing that appears to be good for roughly a 6% reduction in
size on ILP32 machines, possibly a bit more on LP64, and I'll commit the
changes soon.
That should get us down a /little/ bit further towards OpenBGPd
memory-usage levels, maybe to within 15%. But still, our current memory
usage is *not* that dreadful, not even compared to OpenBGPd, imho.
Despite this, the presentations ye (OpenBGPd developers) have been
giving at various forums paint a slightly different picture: "Quagga is
bloated" and "OpenBGPd is really memory efficient" "25MB for two full
feeds", "not even half of other implementations" to paraphrase some
things I've read/heard from presentations. E.g., see:
http://ezine.daemonnews.org/200603/openbgpd.html
from Henning's talk at NANOG recently.
I'm not sure those characterisations are /quite/ fair. They don't seem
representative even of OpenBGPd, other than possibly as a 'best case',
and I think you quite definitely misrepresent the cases of operational
reality (i.e. soft-reconfig) and of other implementations,
unintentionally or not.
It might be an idea in future to provide details about the
characteristics of the feeds used when you provide memory usage figures
to audiences (distinct AS_PATHs, number of prefixes, distinct
attributes, etc. anything which is a significant influence on memory
usage).
> Another urban legend is the coolness of "dynamic route refresh".
> OpenBGPD does announce the route refresh capability and supports
> requests from other peers, but we do not have the button to make a
> refresh request ourselves. There is a simple reason why: "route
> refresh" as in RFC 2918 is totally useless.
Route-refresh has some shortcomings, it's hardly useless though.
Yes, you have to trust your peer to resend its table, but there are many
other things you have to trust your peer to do correctly (including
sending you the original updates in the first place...).
Is there operational experience to back up this "unusable" claim? My
direct experience with Quagga is that RR is quite usable, and I know of
others who rely solely on RR. Looking at the OpenBGPd code, you seem
quite capable of reliably sending table dumps on RR, and presumably
could easily accept them too.
Is it maybe considered useless because other, probably early,
implementations of RR got it wrong?
> Now that's great: you don't know when the refresh is finished. It is
> even worse: you don't know if it worked at all in the first place.
So the long stream of UPDATEs for 175k+ prefixes isn't a clue? :)
You can actually tell whether routes you accepted previously have been
resent or not: you need to keep just one bit of information for each
route in the RIB, if you really had to know (not that I'd advocate this
as worthwhile). You can't detect other routes though, no.
> The End-of-RIB marker introduced in the BGP Graceful-Restart proposal
> should have been in RFC 2918 from the beginning.
ACK, it should be split out as a separate draft and a standalone
capability, not tied in with the GR capability. It could be done
relatively easily, and would be great (I have at least one other
prospective use for it).
I wonder if any other markers might be needed, e.g. whether a
Start-of-RIB would be an idea too (otherwise you possibly need a /very/
long timer to time out on not receiving EoR). Needs thought.
Fancy working on this? Figure out exactly what markers would be useful
for Quagga and OpenBGPd, and what semantics they should have? Could be
very useful.
> We consider the use of RFC 2918 as a substitute for real inbound soft
> reconfiguration unusable until the named issues have been solved.
> Adding a knob to bgpctl to issue a refresh request is not a big issue.
So add it, the users will figure out which form of reconfig best suits
their needs. :)
Users with tight memory-constraints may well prefer route-refresh over
soft-reconfig, despite the flaws.
Thanks for your reply!
regards,
--
Paul Jakma,
Network Approachability, KISS. http://quagga.ireland.sun.com/
Sun Microsystems, Dublin, Ireland. tel: EMEA x19190 / +353 1 819 9190
_______________________________________________
networking-discuss mailing list
[email protected]