Re: OpenBGPD fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory
On 30/06/2020 at 11:56, Claudio Jeker wrote:
> Can you check and monitor with ps aux | grep bgpd and or top the VSZ and
> RSS of the RDE process. What is the maximum you notice.
>
> Also how do you start bgpd? Make sure the limits from login.conf are
> actually applied (using rcctl start should do that while doas bgpd would
> not).

Hi Claudio,

After restarting bgpd on the 2 affected boxes, RAM usage is back to normal.

root   23427  0.0  0.1   79700   88548 ??  S   12:47PM   1:24.88 /usr/sbin/bgpd
_bgpd  35700  0.8  2.5 2052496 2061292 ??  Sp  12:47PM  24:15.01 bgpd: route decision engine (bgpd)
_bgpd  29969  0.8  0.0   21536   10684 ??  Sp  12:47PM   6:56.40 bgpd: session engine (bgpd)

What else, apart from ps aux | grep bgpd, can I give you to help
troubleshoot this issue?

Thanks
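For tracking VSZ and RSS over time, the ps lines above can be parsed mechanically. The following is a minimal Python sketch, not part of the original exchange. It assumes the usual `ps aux` column order (USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND), that VSZ and RSS are reported in kilobytes, and that the STARTED column contains no spaces. The names `parse_ps_line` and `summarize` are hypothetical helpers.

```python
# Sample taken from the ps output quoted in this message.
SAMPLE = """\
root 23427 0.0 0.1 79700 88548 ?? S 12:47PM 1:24.88 /usr/sbin/bgpd
_bgpd 35700 0.8 2.5 2052496 2061292 ?? Sp 12:47PM 24:15.01 bgpd: route decision engine (bgpd)
_bgpd 29969 0.8 0.0 21536 10684 ?? Sp 12:47PM 6:56.40 bgpd: session engine (bgpd)
"""


def parse_ps_line(line):
    """Split one `ps aux` line into the fields of interest.

    Assumed columns: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND.
    The command is kept whole by limiting the split to 10 separators.
    """
    fields = line.split(None, 10)
    return {
        "user": fields[0],
        "pid": int(fields[1]),
        "vsz_kb": int(fields[4]),
        "rss_kb": int(fields[5]),
        "command": fields[10],
    }


def summarize(ps_output):
    """Return parsed rows for every line that mentions bgpd."""
    return [parse_ps_line(l) for l in ps_output.splitlines() if "bgpd" in l]


for row in summarize(SAMPLE):
    print(f"{row['pid']:>6}  VSZ {row['vsz_kb'] / 1024:8.1f}M  "
          f"RSS {row['rss_kb'] / 1024:8.1f}M  {row['command']}")
```

To watch a live system, the same `summarize` function could be fed the text returned by `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout` at regular intervals, keeping the largest RSS seen for the RDE process.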
Re: OpenBGPD fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory
On Tue, Jun 30, 2020 at 10:23:07AM +0200, Laurent CARON wrote:
> Hi,
>
> I'm running a pretty busy OpenBGPd router (~250 bgp sessions) with 4 IPv4
> and 4 IPv6 full views, plus a few IX sessions.
>
> # bgpctl show rib mem
> RDE memory statistics
>     820983 IPv4 unicast network entries using 31.3M of memory
>     203228 IPv6 unicast network entries using 10.9M of memory
>    1935802 rib entries using 118M of memory
>    6348318 prefix entries using 775M of memory
>     728103 BGP path attribute entries using 50.0M of memory
>            and holding 6348318 references
>     464633 BGP AS-PATH attribute entries using 22.3M of memory
>            and holding 728103 references
>      29055 entries for 371905 BGP communities using 8.6M of memory
>            and holding 6348318 references
>      18541 BGP attributes entries using 724K of memory
>            and holding 1618379 references
>      18540 BGP attributes using 145K of memory
>          0 as-set elements in 0 tables using 0B of memory
>         64 prefix-set elements using 3.0K of memory
> RIB using 1008M of memory
> Sets using 3.0K of memory
>
> RDE hash statistics
>     path hash: size 131072, 728103 entries
>         min 0 max 19 avg/std-dev = 5.555/2.268
>     aspath hash: size 131072, 464633 entries
>         min 0 max 17 avg/std-dev = 3.545/1.853
>     comm hash: size 16384, 29055 entries
>         min 0 max 8 avg/std-dev = 1.773/0.925
>     attr hash: size 16384, 18541 entries
>         min 0 max 8 avg/std-dev = 1.132/0.848
>
> More often than not the BGPd daemon is crashing (although having plenty of
> RAM (80G) on the server) with:
>
> /var/log/messages
>
> fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory
>
> fatal in RDE: prefix_alloc: Cannot allocate memory
>
> fatal in RDE: communities_copy: Cannot allocate memory
>
> peer closed imsg connection
> main: Lost connection to RDE
> peer closed imsg connection
> SE: Lost connection to RDE
> peer closed imsg connection
> SE: Lost connection to RDE control
> Can't send message 57 to RDE, pipe closed
> last message repeated 12 times
> peer closed imsg connection
> SE: Lost connection to parent
> neighbor A.B.C.D (sas-v4-001): sending notification: Cease, administratively down
>
> :/etc/login.conf:
>
> default:\
>     :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin /usr/local/sbin:\
>     :umask=022:\
>     :datasize-max=768M:\
>     :datasize-cur=768M:\
>     :maxproc-max=256:\
>     :maxproc-cur=128:\
>     :openfiles-max=1024:\
>     :openfiles-cur=512:\
>     :stacksize-cur=4M:\
>     :localcipher=blowfish,a:\
>     :tc=auth-defaults:\
>     :tc=auth-ftp-defaults:
>
> daemon:\
>     :ignorenologin:\
>     :datasize=infinity:\
>     :maxproc=infinity:\
>     :openfiles-max=1024:\
>     :openfiles-cur=128:\
>     :stacksize-cur=8M:\
>     :localcipher=blowfish,a:\
>     :tc=default:
>
> bgpd:\
>     :openfiles=512:\
>     :tc=daemon:
>
> How can I pinpoint the source of the problem?
>

Can you check and monitor with ps aux | grep bgpd and or top the VSZ and
RSS of the RDE process. What is the maximum you notice.

Also how do you start bgpd? Make sure the limits from login.conf are
actually applied (using rcctl start should do that while doas bgpd would
not).

Cheers
--
:wq Claudio
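The point about login.conf limits can be illustrated with a little arithmetic: the RIB alone is reported at 1008M, while the default class caps datasize at 768M and the daemon class sets it to infinity. The Python sketch below is only an illustration of that comparison, not a diagnosis. It assumes 1024-based size suffixes and treats "infinity" and "unlimited" as no limit. `parse_size` and `fits` are hypothetical helper names.

```python
# Multipliers for the size suffixes used in this thread (assumed 1024-based).
UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '768M' or 'infinity' to a byte count."""
    text = text.strip()
    if text.lower() in ("infinity", "inf", "unlimited"):
        return float("inf")
    suffix = text[-1].upper()
    if suffix in UNITS:
        return float(text[:-1]) * UNITS[suffix]
    return float(text)  # bare number, taken as bytes


def fits(needed, limit):
    """True if a process needing `needed` stays within a datasize `limit`."""
    return parse_size(needed) <= parse_size(limit)


# RIB size from `bgpctl show rib mem` against the two classes quoted above.
print("default class (768M):    ", fits("1008M", "768M"))
print("daemon class (infinity): ", fits("1008M", "infinity"))
```

If bgpd ended up running under the default class limits, allocations would be expected to start failing well before the RIB reached its reported size, which is consistent with the "Cannot allocate memory" errors. Whether that is what happened here depends on how the daemon was started.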
OpenBGPD fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory
Hi,

I'm running a pretty busy OpenBGPd router (~250 bgp sessions) with 4 IPv4
and 4 IPv6 full views, plus a few IX sessions.

# bgpctl show rib mem
RDE memory statistics
    820983 IPv4 unicast network entries using 31.3M of memory
    203228 IPv6 unicast network entries using 10.9M of memory
   1935802 rib entries using 118M of memory
   6348318 prefix entries using 775M of memory
    728103 BGP path attribute entries using 50.0M of memory
           and holding 6348318 references
    464633 BGP AS-PATH attribute entries using 22.3M of memory
           and holding 728103 references
     29055 entries for 371905 BGP communities using 8.6M of memory
           and holding 6348318 references
     18541 BGP attributes entries using 724K of memory
           and holding 1618379 references
     18540 BGP attributes using 145K of memory
         0 as-set elements in 0 tables using 0B of memory
        64 prefix-set elements using 3.0K of memory
RIB using 1008M of memory
Sets using 3.0K of memory

RDE hash statistics
    path hash: size 131072, 728103 entries
        min 0 max 19 avg/std-dev = 5.555/2.268
    aspath hash: size 131072, 464633 entries
        min 0 max 17 avg/std-dev = 3.545/1.853
    comm hash: size 16384, 29055 entries
        min 0 max 8 avg/std-dev = 1.773/0.925
    attr hash: size 16384, 18541 entries
        min 0 max 8 avg/std-dev = 1.132/0.848

More often than not the BGPd daemon is crashing (although having plenty of
RAM (80G) on the server) with:

/var/log/messages

fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory

fatal in RDE: prefix_alloc: Cannot allocate memory

fatal in RDE: communities_copy: Cannot allocate memory

peer closed imsg connection
main: Lost connection to RDE
peer closed imsg connection
SE: Lost connection to RDE
peer closed imsg connection
SE: Lost connection to RDE control
Can't send message 57 to RDE, pipe closed
last message repeated 12 times
peer closed imsg connection
SE: Lost connection to parent
neighbor A.B.C.D (sas-v4-001): sending notification: Cease, administratively down

:/etc/login.conf:

default:\
    :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin /usr/local/sbin:\
    :umask=022:\
    :datasize-max=768M:\
    :datasize-cur=768M:\
    :maxproc-max=256:\
    :maxproc-cur=128:\
    :openfiles-max=1024:\
    :openfiles-cur=512:\
    :stacksize-cur=4M:\
    :localcipher=blowfish,a:\
    :tc=auth-defaults:\
    :tc=auth-ftp-defaults:

daemon:\
    :ignorenologin:\
    :datasize=infinity:\
    :maxproc=infinity:\
    :openfiles-max=1024:\
    :openfiles-cur=128:\
    :stacksize-cur=8M:\
    :localcipher=blowfish,a:\
    :tc=default:

bgpd:\
    :openfiles=512:\
    :tc=daemon:

How can I pinpoint the source of the problem?

Thanks
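As a rough aid for reading the `bgpctl show rib mem` figures, the per-category lines can be ranked by size. The Python sketch below covers only a subset of the lines quoted in this message and assumes the K/M suffixes are 1024-based. Since the reported sizes are rounded, the per-entry figures are approximate. `parse_rib_mem` is a hypothetical helper name.

```python
import re

# Subset of the `bgpctl show rib mem` output quoted in this message.
RIB_MEM = """\
820983 IPv4 unicast network entries using 31.3M of memory
203228 IPv6 unicast network entries using 10.9M of memory
1935802 rib entries using 118M of memory
6348318 prefix entries using 775M of memory
728103 BGP path attribute entries using 50.0M of memory
464633 BGP AS-PATH attribute entries using 22.3M of memory
29055 entries for 371905 BGP communities using 8.6M of memory
18541 BGP attributes entries using 724K of memory
"""

UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}  # assumed 1024-based
LINE = re.compile(r"^\s*(\d+) (.+?) using ([\d.]+)([BKMG]) of memory$")


def parse_rib_mem(text):
    """Return (label, entry_count, bytes) for each 'N ... using X of memory' line."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            count, label, number, unit = m.groups()
            rows.append((label, int(count), float(number) * UNITS[unit]))
    return rows


# Largest consumers first, with an approximate per-entry cost.
for label, count, size in sorted(parse_rib_mem(RIB_MEM), key=lambda r: -r[2]):
    per_entry = size / count if count else 0.0
    print(f"{size / 1024 ** 2:8.1f}M  {per_entry:7.1f} B/entry  {label}")
```

On these numbers the prefix entries dominate, which suggests that the number of prefixes held across all sessions is the main driver of RDE memory use on this router.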