Re: OpenBGPD fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory

2020-06-30 Thread Laurent CARON

On 30/06/2020 at 11:56, Claudio Jeker wrote:

> Can you monitor the VSZ and RSS of the RDE process with ps aux | grep bgpd
> and/or top? What is the maximum you see? Also, how do you start bgpd? Make
> sure the limits from login.conf are actually applied (starting with
> rcctl start should do that, while doas bgpd would not).



Hi Claudio,

After restarting bgpd on 2 affected boxes, RAM usage is back to normal.

root     23427  0.0  0.1   79700   88548 ??  S   12:47PM  1:24.88 /usr/sbin/bgpd
_bgpd    35700  0.8  2.5 2052496 2061292 ??  Sp  12:47PM 24:15.01 bgpd: route decision engine (bgpd)
_bgpd    29969  0.8  0.0   21536   10684 ??  Sp  12:47PM  6:56.40 bgpd: session engine (bgpd)


What else, apart from the output of ps aux | grep bgpd, can I provide to
help troubleshoot this issue?



Thanks



Re: OpenBGPD fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory

2020-06-30 Thread Claudio Jeker
On Tue, Jun 30, 2020 at 10:23:07AM +0200, Laurent CARON wrote:
> Hi,
> 
> 
> I'm running a pretty busy OpenBGPD router (~250 BGP sessions) with 4 IPv4
> and 4 IPv6 full views, plus a few IX sessions.
> 
> 
> # bgpctl show rib mem
> RDE memory statistics
>     820983 IPv4 unicast network entries using 31.3M of memory
>     203228 IPv6 unicast network entries using 10.9M of memory
>    1935802 rib entries using 118M of memory
>    6348318 prefix entries using 775M of memory
>     728103 BGP path attribute entries using 50.0M of memory
>    and holding 6348318 references
>     464633 BGP AS-PATH attribute entries using 22.3M of memory
>    and holding 728103 references
>  29055 entries for 371905 BGP communities using 8.6M of memory
>    and holding 6348318 references
>  18541 BGP attributes entries using 724K of memory
>    and holding 1618379 references
>  18540 BGP attributes using 145K of memory
>  0 as-set elements in 0 tables using 0B of memory
>     64 prefix-set elements using 3.0K of memory
> RIB using 1008M of memory
> Sets using 3.0K of memory
> 
> RDE hash statistics
>     path hash: size 131072, 728103 entries
>     min 0 max 19 avg/std-dev = 5.555/2.268
>     aspath hash: size 131072, 464633 entries
>     min 0 max 17 avg/std-dev = 3.545/1.853
>     comm hash: size 16384, 29055 entries
>     min 0 max 8 avg/std-dev = 1.773/0.925
>     attr hash: size 16384, 18541 entries
>     min 0 max 8 avg/std-dev = 1.132/0.848
> 
> 
> More often than not, bgpd crashes (even though the server has plenty of
> RAM, 80G) with the following in /var/log/messages:
> 
> fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory
> 
> fatal in RDE: prefix_alloc: Cannot allocate memory
> 
> fatal in RDE: communities_copy: Cannot allocate memory
> 
> peer closed imsg connection
> main: Lost connection to RDE
> peer closed imsg connection
> SE: Lost connection to RDE
> peer closed imsg connection
> SE: Lost connection to RDE control
> Can't send message 57 to RDE, pipe closed
> last message repeated 12 times
> peer closed imsg connection
> SE: Lost connection to parent
> neighbor A.B.C.D (sas-v4-001): sending notification: Cease, administratively down
> 
> 
> /etc/login.conf:
> 
> default:\
>     :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin /usr/local/sbin:\
>     :umask=022:\
>     :datasize-max=768M:\
>     :datasize-cur=768M:\
>     :maxproc-max=256:\
>     :maxproc-cur=128:\
>     :openfiles-max=1024:\
>     :openfiles-cur=512:\
>     :stacksize-cur=4M:\
>     :localcipher=blowfish,a:\
>     :tc=auth-defaults:\
>     :tc=auth-ftp-defaults:
> 
> daemon:\
>     :ignorenologin:\
>     :datasize=infinity:\
>     :maxproc=infinity:\
>     :openfiles-max=1024:\
>     :openfiles-cur=128:\
>     :stacksize-cur=8M:\
>     :localcipher=blowfish,a:\
>     :tc=default:
> 
> bgpd:\
>     :openfiles=512:\
>     :tc=daemon:
> 
> How can I pinpoint the source of the problem?
> 

Can you monitor the VSZ and RSS of the RDE process with ps aux | grep bgpd
and/or top? What is the maximum you see? Also, how do you start bgpd? Make
sure the limits from login.conf are actually applied (starting with
rcctl start should do that, while doas bgpd would not).

Cheers
-- 
:wq Claudio
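
A minimal sketch of the monitoring suggested above, assuming the ps aux
column layout shown in this thread (VSZ in column 5, RSS in column 6, both
in kilobytes). The function name rde_mem and the 60-second interval are
illustrative choices, not part of bgpd or its tooling.

```shell
#!/bin/sh
# Print "VSZ_KB RSS_KB" for the bgpd route decision engine, reading
# `ps aux` output on stdin. The [b] bracket keeps the awk process itself
# from matching its own command line in the ps listing.
rde_mem() {
    awk '/[b]gpd: route decision engine/ { print $5, $6 }'
}

# Example loop, commented out so that sourcing this file has no side
# effects. It logs the current values and the highest VSZ seen so far.
# peak=0
# while sleep 60; do
#     set -- $(ps aux | rde_mem)
#     [ "${1:-0}" -gt "$peak" ] && peak=$1
#     echo "$(date) vsz=${1:-?}K rss=${2:-?}K peak_vsz=${peak}K"
# done
```

Fed the RDE line from the ps output earlier in this thread, rde_mem prints
the two numbers 2052496 and 2061292, i.e. roughly 2 GB each.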



OpenBGPD fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory

2020-06-30 Thread Laurent CARON

Hi,


I'm running a pretty busy OpenBGPD router (~250 BGP sessions) with 4
IPv4 and 4 IPv6 full views, plus a few IX sessions.



# bgpctl show rib mem
RDE memory statistics
    820983 IPv4 unicast network entries using 31.3M of memory
    203228 IPv6 unicast network entries using 10.9M of memory
   1935802 rib entries using 118M of memory
   6348318 prefix entries using 775M of memory
    728103 BGP path attribute entries using 50.0M of memory
   and holding 6348318 references
    464633 BGP AS-PATH attribute entries using 22.3M of memory
   and holding 728103 references
 29055 entries for 371905 BGP communities using 8.6M of memory
   and holding 6348318 references
 18541 BGP attributes entries using 724K of memory
   and holding 1618379 references
 18540 BGP attributes using 145K of memory
 0 as-set elements in 0 tables using 0B of memory
    64 prefix-set elements using 3.0K of memory
RIB using 1008M of memory
Sets using 3.0K of memory

RDE hash statistics
    path hash: size 131072, 728103 entries
    min 0 max 19 avg/std-dev = 5.555/2.268
    aspath hash: size 131072, 464633 entries
    min 0 max 17 avg/std-dev = 3.545/1.853
    comm hash: size 16384, 29055 entries
    min 0 max 8 avg/std-dev = 1.773/0.925
    attr hash: size 16384, 18541 entries
    min 0 max 8 avg/std-dev = 1.132/0.848


More often than not, bgpd crashes (even though the server has plenty of
RAM, 80G) with the following in /var/log/messages:


fatal in RDE: rde_dispatch_imsg_session: imsg_get error: Cannot allocate memory


fatal in RDE: prefix_alloc: Cannot allocate memory

fatal in RDE: communities_copy: Cannot allocate memory

peer closed imsg connection
main: Lost connection to RDE
peer closed imsg connection
SE: Lost connection to RDE
peer closed imsg connection
SE: Lost connection to RDE control
Can't send message 57 to RDE, pipe closed
last message repeated 12 times
peer closed imsg connection
SE: Lost connection to parent
neighbor A.B.C.D (sas-v4-001): sending notification: Cease, administratively down



/etc/login.conf:

default:\
    :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin /usr/local/sbin:\
    :umask=022:\
    :datasize-max=768M:\
    :datasize-cur=768M:\
    :maxproc-max=256:\
    :maxproc-cur=128:\
    :openfiles-max=1024:\
    :openfiles-cur=512:\
    :stacksize-cur=4M:\
    :localcipher=blowfish,a:\
    :tc=auth-defaults:\
    :tc=auth-ftp-defaults:

daemon:\
    :ignorenologin:\
    :datasize=infinity:\
    :maxproc=infinity:\
    :openfiles-max=1024:\
    :openfiles-cur=128:\
    :stacksize-cur=8M:\
    :localcipher=blowfish,a:\
    :tc=default:

bgpd:\
    :openfiles=512:\
    :tc=daemon:

How can I pinpoint the source of the problem?


Thanks
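
As a rough way to compare a process size against the datasize values in
the login.conf above, here is a small sketch. The helper names to_kb and
fits_limit are hypothetical, and only the K, M and G suffixes are handled.
It says nothing about how bgpd was actually started; it only shows the
arithmetic for a given VSZ and a given limit.

```shell
#!/bin/sh
# Convert a size such as 768M or 2G to kilobytes.
# Simplified: handles only K, M and G suffixes, otherwise assumes bytes.
to_kb() {
    case $1 in
        *[Kk]) echo "${1%?}" ;;
        *[Mm]) echo $(( ${1%?} * 1024 )) ;;
        *[Gg]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *)     echo $(( $1 / 1024 )) ;;
    esac
}

# Exit status 0 if a process of the given VSZ (in KB) fits under the
# given datasize limit, non-zero otherwise.
fits_limit() {
    vsz_kb=$1
    limit=$2
    [ "$limit" = infinity ] && return 0
    [ "$vsz_kb" -le "$(to_kb "$limit")" ]
}

# Example with numbers from this thread:
#   fits_limit 2052496 768M      # fails: 768M is 786432 KB
#   fits_limit 2052496 infinity  # succeeds
```

With the RDE's VSZ of 2052496 KB reported after the restart, the check
fails against the default class (datasize 768M) and passes against the
daemon class (datasize=infinity), which is why it matters which class the
running bgpd actually got.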