Re: CARP node crashing reproducibly (4.3-stable)

2008-07-23 Thread Stephan A. Rickauer
On Mon, 2008-07-14 at 17:38 +0200, Henning Brauer wrote: > > I have increased kern.maxclusters to gain more time for debugging of the > > memory leak. However, all I could find out so far is that lots of mbufs > > are allocated while there is no significant traffic to be handled > > (remember the m

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-14 Thread Henning Brauer
* Stephan A. Rickauer <[EMAIL PROTECTED]> [2008-07-14 17:27]: > On Mon, 2008-07-14 at 14:22 +0200, Henning Brauer wrote: > > perfect analysis! > > > > looks like the only sane thing to do in that case is to bail and not > > send the icmp. > > I've compiled a new kernel with the patch. The machine

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-14 Thread Stephan A. Rickauer
On Mon, 2008-07-14 at 14:22 +0200, Henning Brauer wrote: > perfect analysis! > > looks like the only sane thing to do in that case is to bail and not > send the icmp. I've compiled a new kernel with the patch. The machine is no longer crashing on pf_send_icmp(). However, I now see memory leaking

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-14 Thread Henning Brauer
* Adrian M. Whatley <[EMAIL PROTECTED]> [2008-07-14 13:54]: > It's a NULL pointer bug! > which is from line 1726 in pf_send_icmp() in pf.c: > > m0->m_pkthdr.pf.flags |= PF_TAG_GENERATED; > Looking at m_copym0, it looks like it can legitimately fail and return > NULL (it even increments a gl
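
For readers following the analysis above, here is a minimal userland sketch of the failure pattern and the "bail and do not send" fix. It is not the actual pf.c patch: `fake_mbuf`, `fake_copy`, `fake_send_icmp` and `FAKE_TAG_GENERATED` are hypothetical stand-ins for the kernel's mbuf, `m_copym0`, `pf_send_icmp` and `PF_TAG_GENERATED`, and the `simulate_failure` argument is an assumption added only to trigger the NULL path.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kernel mbuf; NOT the real struct mbuf. */
struct fake_mbuf {
	unsigned int flags;
	size_t len;
	unsigned char data[64];
};

/* Stand-in for PF_TAG_GENERATED; the value is arbitrary. */
#define FAKE_TAG_GENERATED 0x01u

/*
 * Hypothetical copy routine that, like m_copym0 as described in the
 * thread, can legitimately fail and return NULL. A non-zero 'fail'
 * simulates an allocation failure.
 */
static struct fake_mbuf *
fake_copy(const struct fake_mbuf *src, int fail)
{
	struct fake_mbuf *m;

	if (fail)
		return NULL;
	m = malloc(sizeof(*m));
	if (m == NULL)
		return NULL;
	memcpy(m, src, sizeof(*m));
	return m;
}

/*
 * Sketch of the guarded path: check the copy before touching it.
 * Returns 0 when the packet would have been sent, -1 when we bail.
 */
static int
fake_send_icmp(const struct fake_mbuf *src, int simulate_failure)
{
	struct fake_mbuf *m0 = fake_copy(src, simulate_failure);

	if (m0 == NULL)
		return -1;	/* bail: no copy, so do not send the ICMP */

	/* Without the check above, this line dereferences NULL. */
	m0->flags |= FAKE_TAG_GENERATED;

	/* A real implementation would hand m0 to the output path here. */
	free(m0);
	return 0;
}
```

Calling `fake_send_icmp(&src, 1)` takes the bail-out path and returns -1 instead of crashing; with `0` as the second argument the copy succeeds and the flag is set as before.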

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-14 Thread Adrian M. Whatley
Henning Brauer wrote: | * Stephan A. Rickauer <[EMAIL PROTECTED]> [2008-07-11 16:59]: |> Here's all data I was able to get off our crashing machine, the backup |> node of our CARP cluster, that used to run flawlessly since 3.7. |> |> We can reproduce

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-11 Thread Stephan A. Rickauer
On Fri, 2008-07-11 at 21:32 +0200, Henning Brauer wrote: > * Stephan A. Rickauer <[EMAIL PROTECTED]> [2008-07-11 16:59]: > > Here's all data I was able to get off our crashing machine, the backup > > node of our CARP cluster, that used to run flawlessly since 3.7. > > > > We can reproduce the prob

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-11 Thread Henning Brauer
* Stephan A. Rickauer <[EMAIL PROTECTED]> [2008-07-11 16:59]: > Here's all data I was able to get off our crashing machine, the backup > node of our CARP cluster, that used to run flawlessly since 3.7. > > We can reproduce the problem if you follow http://www.benzedrine.cx/crashreport.html we hav

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-11 Thread Giancarlo Razzolini
Stephan A. Rickauer wrote: > On Fri, 2008-07-11 at 17:09 +0200, Reyk Floeter wrote: > >> hi stephan! >> > > That was quick! Hi Reyk. > > >> can you also show your carp configuration? >> > > Sure (just x'ed out the external IPs as well as passwords). We have a > simple master/b

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-11 Thread Stephan A. Rickauer
On Fri, 2008-07-11 at 17:09 +0200, Reyk Floeter wrote: > hi stephan! That was quick! Hi Reyk. > can you also show your carp configuration? Sure (just x'ed out the external IPs as well as passwords). We have a simple master/backup system: carp0: LAN carp1: DMZ carp2: WLAN carp3: Internet # c

Re: CARP node crashing reproducibly (4.3-stable)

2008-07-11 Thread Reyk Floeter
hi stephan! can you also show your carp configuration? reyk On Fri, Jul 11, 2008 at 04:55:33PM +0200, Stephan A. Rickauer wrote: > Hello, > > Here's all data I was able to get off our crashing machine, the backup > node of our CARP cluster, that used to run flawlessly since 3.7. > > We can rep

CARP node crashing reproducibly (4.3-stable)

2008-07-11 Thread Stephan A. Rickauer
Hello, Here's all data I was able to get off our crashing machine, the backup node of our CARP cluster, that used to run flawlessly since 3.7. We can reproduce the problem by (no joke) installing an openSUSE 10.3 machine in one of our labs over the network. After 40 minutes, our backup firewall c