Hi Markus,
Thanks for the patch; makes sense to me and i see the benefit but
i need some test in lab before committing as it has its potential
danger ;-)
Btw, did you have a look to the config directives bgp_daemon_batch
and bgp_daemon_batch_interval? They allow to re-establish the BGP
peerings in a more "controlled" fashion, ie. a defineable few at a
time, precisely to tackle such scenarios.
Cheers,
Paolo
On Thu, Jul 23, 2015 at 01:25:45PM +0200, Markus Weber wrote:
>
> In heavy loaded environments [Paolo knows our probably better
> than me ;-] BGP sessions might run into timeouts during heavy
> routing updates (on startup it takes here -thanks to the fast
> box it's running on - 30 mins to load all tables).
>
> Reasons seems to be/is the processing loop in bgp.c, which
> handles incoming updates of peers first, which connected first,
> 'till there are no more updates available from them, and then -
> only then - get to handle the later connected peers.
>
> Attached is a small quick-and-dirty hack (against 1.5.0), which
> should fix this (still not a real round-robin fashion, unless
> you have as many peers as configured as max; so it still gives
> a preference to the first ones, but does not forget the later
> ones; more rr like would be to set peers_idx_rr to peers_idx+1
> before the break ...).
>
> Please let me know if you think this break anything ... if not,
> maybe worth merging it into the official release.
>
> Cheers,
> Markus
>
> Pricing Housing
> *** /home/ertools/source/pmacct-1.5.0/src/bgp/bgp.c-orig Thu Jul 23
> 10:17:34 2015
> --- /home/ertools/source/pmacct-1.5.0/src/bgp/bgp.c Thu Jul 23 10:24:18
> 2015
> ***************
> *** 57,62 ****
> --- 57,63 ----
> void skinny_bgp_daemon()
> {
> int slen, ret, rc, peers_idx, allowed;
> + int peers_idx_rr = 0;
> struct host_addr addr;
> struct bgp_header bhdr;
> struct bgp_peer *peer;
> ***************
> *** 507,514 ****
> /* We have something coming in: let's lookup which peer is thatl
> XXX: to be optimized */
> for (peer = NULL, peers_idx = 0; peers_idx <
> config.nfacctd_bgp_max_peers; peers_idx++) {
> ! if (peers[peers_idx].fd && FD_ISSET(peers[peers_idx].fd,
> &read_descs)) {
> ! peer = &peers[peers_idx];
> break;
> }
> }
> --- 508,516 ----
> /* We have something coming in: let's lookup which peer is thatl
> XXX: to be optimized */
> for (peer = NULL, peers_idx = 0; peers_idx <
> config.nfacctd_bgp_max_peers; peers_idx++) {
> ! if (peers[(peers_idx+peers_idx_rr)%config.nfacctd_bgp_max_peers].fd
> && FD_ISSET(peers[(peers_idx+peers_idx_rr)%config.nfacctd_bgp_max_peers].fd,
> &read_descs)) {
> ! peer = &peers[(peers_idx+peers_idx_rr)%config.nfacctd_bgp_max_peers];
> ! peers_idx_rr = (peers_idx_rr+1)%config.nfacctd_bgp_max_peers;
> break;
> }
> }
> _______________________________________________
> pmacct-discussion mailing list
> http://www.pmacct.net/#mailinglists
_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists