Hi Claudio,

Note: Any comment below straying from the technical is intended as /friendly/ banter. Please don't read it any other way!

On Tue, 7 Mar 2006, Claudio Jeker wrote:

> I was notified about this rather long thread by Paul Jakma about OpenBGPD and I think I have to clarify a few things.

I figured you'd hear of it, and am very interested to hear your side.

> Paul is talking a lot about memory requirements and uses the numbers of a bug report to compare OpenBGPD with quagga.

I did apologise in advance for doing that. Those were the only numbers I could find with Google. I only used the numbers for the accounted-for usage (which I presumed did not include leaked RAM, given it was so much less than, and not too inconsistent with, the total usage reported prior to soft-reconfig).

Also, the reporter mentioned elsewhere in that thread that their /normal/ usage, prior to the soft-reconfig integration, was:

  "Before the upgrade, I was running at something like 60-80 if I
   remember it well."

That's not quite the "25MB" often mentioned in OpenBGPd presentations for /two/ full feeds. ;) So either:

- the reporter's memory was wrong,

or

- the much-bandied "2 full feeds -> 25MB" figure either:
  - doesn't tie in with operational reality,
  or
  - is derived with feeds which are 'best case'
    (e.g. both from the same upstream, so the AS_PATHs are exactly the same)
  or
  - is simply out of date, e.g. maybe because:
    - bigger full feeds these days
    - the figure refers to older OpenBGPd and memory needs have changed.

?

FWIW, here's data from a current Quagga 0.99 bgpd (with a couple of small changes to the definition of a structure, see below), with one full 175k feed and another ~25k partial feed from a different upstream; 65k AS_PATH attributes and 100k general attributes in total:

VmPeak:    86516 kB
VmSize:    85920 kB
VmRSS:     82084 kB
VmData:    80824 kB
VmStk:        88 kB
VmExe:       648 kB
VmLib:      3676 kB

85MB, which is quite comparable to the "60 to 80MB" usage the OpenBGPd user I quoted reported for pre-soft-reconfig, presuming they weren't mistaken.

The 'zebra' daemon is quite bloated, at 61MB. That should stay constant though, as explained in other emails. We'll try to fix it at some stage (it stores nexthop information per-route, which probably accounts for 40 to 50% of its RAM usage and is mostly utterly redundant).
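To illustrate what I mean (made-up structures for illustration only, not zebra's actual ones): rather than every route carrying its own copy of the nexthop data, routes could point at one shared, refcounted copy of each distinct nexthop, much as bgpd already does for path attributes:

  /* Illustrative only -- not zebra's real data structures. */

  /* Today-ish: nexthop data embedded in, and repeated for, every route. */
  struct route_embedded
  {
    unsigned char prefix[4];
    unsigned char prefixlen;
    unsigned      nh_ifindex;
    unsigned char nh_gate[4];
  };

  /* Shared: routes with the same nexthop point at one interned,
   * refcounted object instead of carrying a copy.                */
  struct nexthop_shared
  {
    unsigned      refcnt;
    unsigned      ifindex;
    unsigned char gate[4];
  };

  struct route_shared
  {
    unsigned char prefix[4];
    unsigned char prefixlen;
    struct nexthop_shared *nh;   /* one pointer instead of a copy */
  };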

> If I count correctly I get 150MB memory usage and some sessions have been up for more than 8 weeks. Yes, this box has soft-reconfig disabled but it would not matter anyway because there is no filtering done.

Very good indeed, but I still don't think it's /incomparable/ to Quagga.

Out of curiosity, what would that figure look like with soft-reconfig enabled? (Even an estimate.) Quagga's soft-reconfig overhead is minimal: no extra RIB table entries, just one ~32-byte struct per BGP path, and if no modifications are made to attributes (i.e. filter only) no additional attribute overhead. So about 5MB per 180k full feed, I think.
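To make that arithmetic concrete, the per-path bookkeeping amounts to roughly the following (an illustrative struct with made-up field names, not the actual Quagga one):

  /* Illustrative only -- roughly what soft-reconfig keeps per received
   * path: list linkage plus a pointer to the (shared, refcounted)
   * attribute set.                                                     */
  struct adj_in_sketch
  {
    struct adj_in_sketch *next;
    struct adj_in_sketch *prev;
    void *peer;                   /* who sent the route      */
    void *attr;                   /* shared, refcounted attr */
  };

  /* 16 bytes on ILP32, 32 on LP64, plus allocator overhead: call it
   * ~32 bytes per path.  32 bytes * 180,000 prefixes ~= 5.5MB, hence
   * "about 5MB" per full feed.                                        */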

Here are figures for a Quagga bgpd that is roughly similar to your config; they are from the RIPE NCC "RIS" routing-data collector project (see http://www.ris.ripe.net):

- older Quagga, 0.96.5, but I believe it should still be representative
- just under 100 peers
- All but a few sessions are several weeks old
- More than 5 sessions are up for *many* months
- not forwarding, hence not running zebra
  (explains previous line, given the version)
- 8 full ~180k feeds,
- slightly fewer distinct AS_PATH entries than yours, though: ~256k

Memory usage is:

VmSize:   187252 kB
VmLck:         0 kB
VmRSS:    183072 kB
VmData:   184488 kB
VmStk:        36 kB
VmExe:       596 kB
VmLib:      1972 kB

187MB, versus the 150MB for OpenBGPd. So there's a 20% difference from OpenBGPd -> Quagga. That's certainly still comparable at least.

Here's one with:

- order of 150ish peers
- most sessions are up for several weeks
- again, not forwarding, not running zebra
- 13 full ~180k IPv4 feeds
- also 0.96.5
- about 55% more AS_PATH attributes (~400k):

VmSize:   278992 kB
VmLck:         0 kB
VmRSS:    262996 kB
VmData:   276224 kB
VmStk:        40 kB
VmExe:       596 kB
VmLib:      1972 kB

Memory usage is roughly 50% greater, broadly in line with the ~55% difference in AS_PATH attributes.

Now, there's actually some really 'low-hanging fruit': junk that shouldn't be in our core data structures, and wasted padding due to poor layout. That appears to be good for roughly a 6% reduction in size on ILP32 machines, possibly a bit more on LP64, and I'll commit the changes soon.
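To give a flavour of the padding part (a contrived example, nothing to do with the actual Quagga struct layouts):

  /* Contrived example of layout waste on ILP32 (4-byte aligned pointers). */
  struct wasteful
  {
    char  flags;    /* 1 byte + 3 bytes padding  */
    void *next;     /* 4 bytes                   */
    char  origin;   /* 1 byte + 3 bytes padding  */
    void *attr;     /* 4 bytes -> 16 bytes total */
  };

  struct packed_better
  {
    void *next;
    void *attr;
    char  flags;
    char  origin;   /* 8 + 2, padded to 12 bytes: a 25% saving here */
  };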

That should get us a /little/ bit closer to OpenBGPd memory-usage levels, maybe to within 15%. But still, our current memory usage is *not* that dreadful, not even compared to OpenBGPd, imho.

Despite this, the presentations ye (OpenBGPd developers) have been giving at various forums paint a slightly different picture: "Quagga is bloated", "OpenBGPd is really memory efficient", "25MB for two full feeds", "not even half of other implementations", to paraphrase some things I've read/heard from presentations. E.g., see:

        http://ezine.daemonnews.org/200603/openbgpd.html

from Henning's talk at NANOG recently.

I'm not sure those characterisations are /quite/ fair.

They don't seem to be representative even of OpenBGPd, other than possibly as a 'best case', and I think you quite definitely misrepresent both operational reality (i.e. soft-reconfig) and other implementations, unintentionally or not.

It might be an idea, in future, to provide details about the characteristics of the feeds used when you give memory-usage figures to audiences (distinct AS_PATHs, number of prefixes, distinct attributes, etc.; anything which significantly influences memory usage).

> Another urban legend is the coolness of "dynamic route refresh".
> OpenBGPD does announce the route refresh capability and supports
> requests from other peers, but we do not have the button to make a
> refresh request ourselves. There is a simple reason why: "route
> refresh" as in RFC 2918 is totally useless.

Route-refresh has some shortcomings, but it's hardly useless.

Yes, you have to trust your peer to resend its table, but there are many other things you have to trust your peer to do correctly (including sending you the original updates in the first place...).

Is there operational experience to back up this "unusable" claim? My direct experience with Quagga is that RR is quite usable, and I know of others who rely solely on RR. Looking at the OpenBGPd code, you seem quite capable of reliably sending table dumps on RR, and presumably could easily accept them too.

Is it maybe considered useless because other, probably early, implementation(s) of RR got it wrong?

> Now that's great: you don't know when the refresh is finished. It is even worse: you don't know if it worked at all in the first place.

So the long stream of UPDATEs for 175k+ prefixes isn't a clue? :)

You can actually tell whether routes you accepted previously have been resent or not: you need to keep just one bit of information for each route in the RIB, if you really had to know (not that I'd advocate this as worthwhile).

Can't detect other routes though, no.
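For what it's worth, here's a minimal sketch of the kind of mark-and-sweep I mean (made-up names, not actual Quagga code):

  struct rib_entry
  {
    struct rib_entry *next;
    int               peer;     /* which peer announced it */
    unsigned          flags;
  };

  #define RIB_REFRESH_PENDING 0x01u   /* the "one bit" per route */

  /* Before sending the REFRESH request, mark everything held from
   * this peer as awaiting re-announcement.                        */
  static void
  refresh_mark (struct rib_entry *rib, int peer)
  {
    struct rib_entry *r;
    for (r = rib; r; r = r->next)
      if (r->peer == peer)
        r->flags |= RIB_REFRESH_PENDING;
  }

  /* Each UPDATE received afterwards clears the bit on the routes it
   * re-announces; anything still marked once the refresh is deemed
   * complete (a long timer, or an End-of-RIB if we had one) was not
   * resent.                                                         */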

> The End-of-RIB marker introduced in the BGP Graceful-Restart proposal should have been in RFC 2918 from the beginning.

ACK, it should be split out as a separate draft and a standalone capability, not tied in with the GR capability. It could be done relatively easily, and would be great (I have at least one other prospective use for it).

I wonder if any other markers might be needed; e.g. whether a Start-of-RIB would be an idea too (otherwise possibly a /very/ long timer to time out on not receiving the EoR). Needs thought.
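For reference, the End-of-RIB marker itself is trivial to recognise; for IPv4 unicast the graceful-restart draft defines it as the minimal UPDATE, i.e. something like the check below (a sketch, not code from either daemon):

  #define BGP_HEADER_SIZE 19

  /* End-of-RIB for IPv4 unicast: an UPDATE with zero Withdrawn Routes
   * Length and zero Total Path Attribute Length, i.e. 23 bytes total. */
  static int
  update_is_end_of_rib (const unsigned char *msg, unsigned len)
  {
    return len == BGP_HEADER_SIZE + 4
           && msg[BGP_HEADER_SIZE]     == 0   /* withdrawn len (hi) */
           && msg[BGP_HEADER_SIZE + 1] == 0   /* withdrawn len (lo) */
           && msg[BGP_HEADER_SIZE + 2] == 0   /* path attr len (hi) */
           && msg[BGP_HEADER_SIZE + 3] == 0;  /* path attr len (lo) */
  }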

Fancy working on this? Figure out exactly what markers would be useful for Quagga and OpenBGPd, and what semantics they should have? Could be very useful.

> We consider the use of RFC 2918 as a substitute for real inbound soft reconfiguration unusable until the named issues have been solved.

> Adding a knob to bgpctl to issue a refresh request is not a big issue.

So add it, the users will figure out which form of reconfig best suits their needs. :)

Users with tight memory-constraints may well prefer route-refresh over soft-reconfig, despite the flaws.

Thanks for your reply!

regards,
--
Paul Jakma,
Network Approachability, KISS.          http://quagga.ireland.sun.com/
Sun Microsystems, Dublin, Ireland.      tel: EMEA x19190 / +353 1 819 9190