Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Otto Moerbeek
On Tue, Nov 30, 2010 at 08:35:46AM +0100, Xavier Beaudouin wrote:

 Hello,
 
 I have updated an OpenBGPD router from OpenBSD 4.7 i386 to 4.8 amd64.
 
 Now I see new instability like this:
 
 Nov 29 21:25:22 core-3 bgpd[28895]: fatal in RDE: path_alloc: Cannot allocate memory
 Nov 30 02:01:47 core-3 bgpd[5522]: fatal in RDE: up_generate: Cannot allocate memory
 
 I have 2 GB on this machine and a login.conf like this:
 
 default:\
 :path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin:\
 :umask=022:\
 :datasize-max=1512M:\
 :datasize-cur=1024M:\
 :maxproc-max=2048:\
 :maxproc-cur=1024:\
 :openfiles-cur=1024:\
 :stacksize-cur=4M:\
 :localcipher=blowfish,6:\
 :ypcipher=old:\
 :tc=auth-defaults:\
 :tc=auth-ftp-defaults:
 
 This is driving me mad, because this router handles more than 130 peers and
 is still unstable.
 
 What is needed to make OpenBGPD work as it should and shut up?
 
 (I am going to add monit, because this is not acceptable in production.)
 
 Xavier

By default, daemons run in the daemon login class, so limits set in the
default class may not apply to bgpd.  Check that class; also check that
you do not have a stale /etc/login.conf.db file lying around.

AFAIK, bgpd does not raise its limits to the max, so it does
not make sense to have different values for -max and -cur.

If these things don't help, analyzing this requires some specific bgpd
knowledge, which I do not have. 

-Otto
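
To see which data size limits a process actually inherits, a small C helper along the following lines could be built and run from the same environment that starts bgpd. This is a minimal sketch and not part of bgpd: report_data_limit() and print_limit() are hypothetical names, and the values printed only describe the process that calls them.

```c
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Print one limit value, handling the "unlimited" sentinel. */
static void
print_limit(const char *label, rlim_t value)
{
	if (value == RLIM_INFINITY)
		printf("%s: unlimited\n", label);
	else
		printf("%s: %llu MB\n", label,
		    (unsigned long long)value / (1024 * 1024));
}

/*
 * Report the soft (cur) and hard (max) data size limits of the
 * calling process.  Returns 0 on success, -1 if getrlimit() fails.
 */
int
report_data_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == -1) {
		perror("getrlimit");
		return -1;
	}
	print_limit("datasize-cur", rl.rlim_cur);
	print_limit("datasize-max", rl.rlim_max);
	return 0;
}
```

Calling report_data_limit() from a trivial main() started the same way as the daemon would show whether the login.conf values being edited are the ones in effect.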



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Stuart Henderson
Is this box acting as a route-reflector?



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Claudio Jeker
Maybe it is time to change the default data limit in the RDE; something
like the diff below may help.
bgpd needs quite a bit more (temporary) memory when running with
softreconfig. A lot of additional memory is needed on reloads and when
large sessions flap, which causes a lot of UPDATE messages.

Side note: bgpd on amd64 needs quite a bit more memory than on i386 because
of the 64-bit pointers.
-- 
:wq Claudio

Index: rde.c
===================================================================
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.302
diff -u -p -r1.302 rde.c
--- rde.c   24 Nov 2010 00:58:10 -  1.302
+++ rde.c   30 Nov 2010 10:12:56 -
@@ -18,6 +18,8 @@
 
 #include <sys/types.h>
 #include <sys/socket.h>
+#include <sys/time.h>
+#include <sys/resource.h>
 
 #include <errno.h>
 #include <ifaddrs.h>
@@ -156,6 +158,7 @@ pid_t
 rde_main(int pipe_m2r[2], int pipe_s2r[2], int pipe_m2s[2], int pipe_s2rctl[2],
     int debug)
 {
+	struct rlimit		 rl;
 	pid_t			 pid;
 	struct passwd		*pw;
 	struct pollfd		*pfd = NULL;
@@ -184,6 +187,13 @@ rde_main(int pipe_m2r[2], int pipe_s2r[2
 
 	setproctitle("route decision engine");
 	bgpd_process = PROC_RDE;
+
+	if (getrlimit(RLIMIT_DATA, &rl) == -1)
+		fatal("getrlimit");
+	rl.rlim_cur = RLIM_INFINITY;
+	rl.rlim_max = RLIM_INFINITY;
+	if (setrlimit(RLIMIT_DATA, &rl) == -1)
+		fatal("setrlimit");
 
 	if (setgroups(1, &pw->pw_gid) ||
 	    setresgid(pw->pw_gid, pw->pw_gid, pw->pw_gid) ||
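
The side note about 64-bit pointers can be illustrated with a pointer-heavy structure. The following is only a sketch: struct node is a hypothetical type invented for illustration and is not a structure from bgpd. With the usual ABIs it typically occupies 16 bytes on i386 and 32 bytes on amd64, because each pointer doubles in size and padding is added for alignment.

```c
#include <stddef.h>

/* Hypothetical list node; not a structure taken from bgpd. */
struct node {
	struct node	*next;
	struct node	*prev;
	void		*data;
	unsigned int	 flags;
};

/* Bytes of one node occupied by its three pointers on this platform. */
size_t
node_pointer_bytes(void)
{
	return 3 * sizeof(void *);
}

/* Total size of one node, including any alignment padding. */
size_t
node_total_bytes(void)
{
	return sizeof(struct node);
}
```

Multiplying the per-node difference by the number of prefixes and paths held in memory gives a rough feel for why the same table needs more room after moving to amd64.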



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Otto Moerbeek
Two questions: 

- why the getrlimit() if you are setting both cur and max?

- isn't it better to set cur to max?  Running with no bounds feels not ok.

-Otto



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Xavier Beaudouin
Hello,

On 30 Nov 2010 at 11:03, Stuart Henderson wrote:

 Is this box acting as a route-reflector?

No route reflector at all.

It is a peering box with 3 IXes, one transit and 3 iBGP sessions (count 6, because
I use IPv6 as well).

The configuration of this box is available on request.

Xavier



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Claudio Jeker
On Tue, Nov 30, 2010 at 12:06:38PM +0100, Otto Moerbeek wrote:
 Two questions: 
 
 - why the getrlimit() if you are setting both cur and max?
 

Because I first thought of only setting cur.

 - isn't it better to set cur to max?  Running with no bounds feels not ok.
 

First I planned to do the same thing but then decided against it. But
maybe I'm wrong and it would be better to just set cur to max and hope
people have sensible max values.

-- 
:wq Claudio
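
The alternative discussed here, raising the soft limit only as far as the administrator's hard limit, could look roughly like the following. This is a sketch of the idea under discussion, not the change that was committed to bgpd; raise_soft_to_hard() is a hypothetical helper name.

```c
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Raise the soft limit of `resource` to its current hard limit.
 * The hard limit itself is left untouched, so the ceiling configured
 * by the administrator (for example datasize-max) still applies.
 * Returns 0 on success, -1 on error with errno set.
 */
int
raise_soft_to_hard(int resource)
{
	struct rlimit rl;

	if (getrlimit(resource, &rl) == -1)
		return -1;
	rl.rlim_cur = rl.rlim_max;
	return setrlimit(resource, &rl);
}
```

In this variant the getrlimit() call is needed, since the hard limit has to be read before it can be copied into rlim_cur. A caller would use raise_soft_to_hard(RLIMIT_DATA) and treat a -1 return as fatal.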



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Xavier Beaudouin
Hi Claudio,

 Maybe it is time to change the default datalimit in the RDE. So maybe
 something like this may help.
 bgpd needs quite a bit more (temporary) memory when running with
 softreconfig. A lot of additional memory is needed on reloads and when
 large sessions flap that cause a lot of UPDATE messages.

 Side note: bgpd on amd64 needs quite a bit more memory than on i386 because
 of the 64-bit pointers.

Yeah... That's why I have 2 GB on this machine; I hope this will be enough.
1 GB on i386 was OK, so...

I will tell you if this fixes my problem (if you don't hear from me, it can
be considered fixed). Ping me if you need a clear status.

Cheers.
Xavier



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Xavier Beaudouin
Hi Claudio,

This patch opens another problem: it seems that the FIB is not updated at all
when it is applied.

I reverted to the OpenBGPD 4.8 release.

:(
Xavier



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Claudio Jeker
On Tue, Nov 30, 2010 at 06:24:32PM +0100, Xavier Beaudouin wrote:
 This patch opens another problem: it seems that the FIB is not updated at all
 when it is applied.
 
 I reverted to the OpenBGPD 4.8 release.
 

Are you sure you have
http://ftp.openbsd.org/pub/OpenBSD/patches/4.8/common/001_bgpd.patch
installed? If not, that could be the cause of your problem.

-- 
:wq Claudio



Re: OpenBGPD fatal in RDE : cannot allocate memory

2010-11-30 Thread Xavier Beaudouin
Hi Claudio,

On 30 Nov 2010 at 19:38, Claudio Jeker wrote:
 You sure you have
 http://ftp.openbsd.org/pub/OpenBSD/patches/4.8/common/001_bgpd.patch
 installed? Since that could be the cause of your problem.

Both patches are applied now... I will see if those 2 patches fix the problem.

Sincerely,
Xavier



OpenBGPD fatal in RDE : cannot allocate memory

2010-11-29 Thread Xavier Beaudouin
Hello,

I have updated an OpenBGPD router from OpenBSD 4.7 i386 to 4.8 amd64.

Now I see new instability like this:

Nov 29 21:25:22 core-3 bgpd[28895]: fatal in RDE: path_alloc: Cannot allocate memory
Nov 30 02:01:47 core-3 bgpd[5522]: fatal in RDE: up_generate: Cannot allocate memory

I have 2 GB on this machine and a login.conf like this:

default:\
:path=/usr/bin /bin /usr/sbin /sbin /usr/X11R6/bin /usr/local/bin:\
:umask=022:\
:datasize-max=1512M:\
:datasize-cur=1024M:\
:maxproc-max=2048:\
:maxproc-cur=1024:\
:openfiles-cur=1024:\
:stacksize-cur=4M:\
:localcipher=blowfish,6:\
:ypcipher=old:\
:tc=auth-defaults:\
:tc=auth-ftp-defaults:

This is driving me mad, because this router handles more than 130 peers and
is still unstable.

What is needed to make OpenBGPD work as it should and shut up?

(I am going to add monit, because this is not acceptable in production.)

Xavier