Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-26 Thread Vladimir Sharun
Confirmed, no more leaks. I put the patch on one production server an hour
ago, and this evening I'll put it on another one. Each server handles up to 2
million SMTP connections per day, so the load is heavy. Both servers are SMP.

Gleb Smirnoff wrote:
 GS>   please confirm that the attached patch fixes your problem. The patch is
 GS> relative to the src/sys tree.


-- 
UKR.NET Postmaster
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-25 Thread Gleb Smirnoff
On Tue, Oct 25, 2005 at 06:04:15PM +0200, Max Laier wrote:
M> On Tuesday 25 October 2005 17:00, Gleb Smirnoff wrote:
M> >   Vladimir,
M> >
M> >   please confirm that the attached patch fixes your problem. The patch is
M> > relative to the src/sys tree.
M> >
M> >   Kris, Christian, please review it. Thanks.
M> 
M> Are you sure it's safe to free the nlminfo struct before calling fdfree()
M> in exit1()?  It sounds like it might need the structure if there are pending
M> locks?  Just a guess, though.

The locks are actually held on the server side, so we aren't leaking
them here. Anyway, the fix comes directly from BSD/OS, which is where
nlminfo came from.

M> On a side note, there are some whitespace errors in and before 
M> nlminfo_release().

Thanks. I'll take this into account.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE


Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-25 Thread Max Laier
On Tuesday 25 October 2005 17:00, Gleb Smirnoff wrote:
>   Vladimir,
>
>   please confirm that the attached patch fixes your problem. The patch is
> relative to the src/sys tree.
>
>   Kris, Christian, please review it. Thanks.

Are you sure it's safe to free the nlminfo struct before calling fdfree()
in exit1()?  It sounds like it might need the structure if there are pending
locks?  Just a guess, though.

On a side note, there are some whitespace errors in and before 
nlminfo_release().

-- 
/"\  Best regards,  | [EMAIL PROTECTED]
\ /  Max Laier  | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | [EMAIL PROTECTED]
/ \  ASCII Ribbon Campaign  | Against HTML Mail and News




Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-25 Thread Gleb Smirnoff
  Vladimir,

  please confirm that the attached patch fixes your problem. The patch is relative
to the src/sys tree.

  Kris, Christian, please review it. Thanks.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
Index: nfsclient/nfs_lock.c
===
RCS file: /home/ncvs/src/sys/nfsclient/nfs_lock.c,v
retrieving revision 1.40
diff -u -r1.40 nfs_lock.c
--- nfsclient/nfs_lock.c 6 Dec 2004 08:31:32 -   1.40
+++ nfsclient/nfs_lock.c 25 Oct 2005 14:51:11 -
@@ -62,9 +62,13 @@
 #include 
 #include 
 
+extern void (*nlminfo_release_p)(struct proc *p);
+
 MALLOC_DEFINE(M_NFSLOCK, "NFS lock", "NFS lock request");
+MALLOC_DEFINE(M_NLMINFO, "nlminfo", "NFS lock process structure");
 
 static int nfslockdans(struct thread *td, struct lockd_ans *ansp);
+static void nlminfo_release(struct proc *p);
 /*
  * 
  * A miniature device driver which the userland uses to talk to us.
@@ -194,6 +198,7 @@
printf("nfslock: pseudo-device\n");
mtx_init(&nfslock_mtx, "nfslock", NULL, MTX_DEF);
TAILQ_INIT(&nfslock_list);
+   nlminfo_release_p = nlminfo_release;
nfslock_dev = make_dev(&nfslock_cdevsw, 0,
UID_ROOT, GID_KMEM, 0600, _PATH_NFSLCKDEV);
return (0);
@@ -259,7 +264,7 @@
 */
if (p->p_nlminfo == NULL) {
MALLOC(p->p_nlminfo, struct nlminfo *,
-   sizeof(struct nlminfo), M_LOCKF, M_WAITOK | M_ZERO);
+   sizeof(struct nlminfo), M_NLMINFO, M_WAITOK | M_ZERO);
p->p_nlminfo->pid_start = p->p_stats->p_start;
timevaladd(&p->p_nlminfo->pid_start, &boottime);
}
@@ -381,3 +386,12 @@
return (0);
 }
 
+/*
+ * Free nlminfo attached to process.
+ */
+void
+nlminfo_release(struct proc *p)
+{  
+free(p->p_nlminfo, M_NLMINFO);
+p->p_nlminfo = NULL;
+}
Index: nfsclient/nlminfo.h
===
RCS file: /home/ncvs/src/sys/nfsclient/nlminfo.h,v
retrieving revision 1.2
diff -u -r1.2 nlminfo.h
--- nfsclient/nlminfo.h 18 Sep 2001 23:31:53 -  1.2
+++ nfsclient/nlminfo.h 25 Oct 2005 14:40:30 -
@@ -40,5 +40,3 @@
int getlk_pid;
 struct  timeval pid_start;  /* process starting time */
 };
-
-extern void nlminfo_release(struct proc *p);
Index: kern/kern_exit.c
===
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.268
diff -u -r1.268 kern_exit.c
--- kern/kern_exit.c 23 Oct 2005 12:19:08 -  1.268
+++ kern/kern_exit.c 25 Oct 2005 14:45:35 -
@@ -82,6 +82,9 @@
 /* Required to be non-static for SysVR4 emulator */
 MALLOC_DEFINE(M_ZOMBIE, "zombie", "zombie proc status");
 
+/* Hook for NFS teardown procedure. */
+void (*nlminfo_release_p)(struct proc *p);
+
 /*
  * exit --
  * Death of process.
@@ -234,6 +237,12 @@
funsetownlst(&p->p_sigiolst);
 
/*
+* If this process has an nlminfo data area (for lockd), release it
+*/
+   if (nlminfo_release_p != NULL && p->p_nlminfo != NULL)
+   (*nlminfo_release_p)(p);
+
+   /*
 * Close open files and release open-file table.
 * This may block!
 */
Index: sys/lockf.h
===
RCS file: /home/ncvs/src/sys/sys/lockf.h,v
retrieving revision 1.18
diff -u -r1.18 lockf.h
--- sys/lockf.h 25 Jan 2005 10:15:25 -  1.18
+++ sys/lockf.h 25 Oct 2005 14:51:28 -
@@ -40,10 +40,6 @@
 
 struct vop_advlock_args;
 
-#ifdef MALLOC_DECLARE
-MALLOC_DECLARE(M_LOCKF);
-#endif
-
 /*
  * The lockf structure is a kernel structure which contains the information
  * associated with a byte range lock.  The lockf structures are linked into

Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-25 Thread Vladimir Sharun
Pete French wrote:
>> I found the source of the leak: if exim accesses ANY configuration/text
>> files over NFS, there will be a leak. And the more often exim is called,
>> the quicker your system dies.

 PF> Surely this has to be a problem with NFS in the kernel, not with exim
 PF> though? Did you log a FreeBSD PR on this? I've been following your
 PF> tracking of it with interest, and I don't want it to get lost in the noise!

So, the test:

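#include <sys/file.h>   /* flock(), LOCK_* */
#include <errno.h>
#include <fcntl.h>      /* open(), O_* */
#include <stdio.h>
#include <unistd.h>     /* close() */

int main(void) {
    const char *tempfile = "/media/testfile";   /* file on the NFS mount */
    int lockfd;

    /* O_CREAT needs a mode argument, and open() needs an access mode. */
    lockfd = open(tempfile, O_RDWR | O_CREAT, 0644);
    if (lockfd == -1) {
        printf("Open errno:   %d\n", errno);
        return 1;
    }
    /* Try a shared lock, then an exclusive one, without blocking. */
    if (flock(lockfd, LOCK_SH | LOCK_NB) == -1)
        printf("ERROR shared lock:  %d\n", errno);
    if (flock(lockfd, LOCK_EX | LOCK_NB) == -1)
        printf("ERROR exclusive lock:  %d\n", errno);
    close(lockfd);
    return 0;
}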

If *ANY* lock is transferred over the wire, the memory allocated for it is
never freed, regardless of errno. Whether rpc.lockd is not running (errno 45)
or locking is set up correctly, the leak is there in either case. The only
way to avoid the leak is to mount the NFS share with -L (do not transfer
fcntl() locks over the wire).
Systems affected: any FreeBSD 6.0 up to RC1.

Whoooh! 30 hours of investigations ;-)

-- 
UKR.NET Postmaster


Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-25 Thread Pete French
> I found the source of the leak: if exim accesses ANY configuration/text
> files over NFS, there will be a leak. And the more often exim is called,
> the quicker your system dies.

Surely this has to be a problem with NFS in the kernel, not with exim
though? Did you log a FreeBSD PR on this? I've been following your tracking
of it with interest, and I don't want it to get lost in the noise!

-pcf.


Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-25 Thread Vladimir Sharun
I found the source of the leak: if exim accesses ANY configuration/text
files over NFS, there will be a leak. And the more often exim is called,
the quicker your system dies.

My main problem now is to build a near-realtime NFS-to-local mirroring solution
for around 20 files (up to 1 MB in total). Is there any solution in /ports?

The next question is for Philip Hazel: any comments on why this happens?

Vladimir Sharun wrote:
 VS> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
 VS> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
 VS> 335bla-bla allocated".
 VS> I looked in the handbook and put vm.kmem_size_max="536870912" into
 VS> /boot/loader.conf.
 VS> Today the same thing happened with the new parameters. Are there any other
 VS> solutions?

 VS> # sysctl -a | grep kmem
 VS> vm.kmem_size: 536870912
 VS> vm.kmem_size_max: 536870912
 VS> vm.kmem_size_scale: 3

 VS> Only vm.kmem_size_max is set in loader.conf, not vm.kmem_size.

 VS> We're running FreeBSD 6.0-BETA5 #0: Wed Sep 28 16:54:33 EEST 2005
 VS> in i386 mode. The same happened with 5.3/5.4 and NetBSD 2.0 on this machine.



Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-24 Thread Vladimir Sharun
Kris Kennaway wrote:
>> Looks like a kernel leak in lockf (thanks to Gleb Smirnoff for the tip).
>> # vmstat -zm | grep lock
>> lockf 2257779 70556K   - 19476940  32,64
>> ... and it keeps rising.
>> 
>> That's another machine with 1Gb RAM, also with 512M for vm.kmem_size_max.

 KK> OK, what version was this again?

6.0-BETA5. The same for RC1.



Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-24 Thread Kris Kennaway

On Sun, Oct 23, 2005 at 08:07:09PM +0300, Vladimir Sharun wrote:
> Kris Kennaway wrote:
> > >>> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week
> > >>> or two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
> > >>> 335bla-bla allocated".
> > >>> I looked in the handbook and put vm.kmem_size_max="536870912" into
> > >>> /boot/loader.conf.
> > >>> Today the same thing happened with the new parameters. Are there any
> > >>> other solutions?
> >> 
> >  KK>> If that's not enough, try making it larger.
> >> 
> >> At what size can we be sure that "that's enough"?
> 
>  KK> It depends on your workload, so keep increasing it until the problem
>  KK> stops or you discover that you need more RAM to handle your workload.
> 
> Looks like a kernel leak in lockf (thanks to Gleb Smirnoff for the tip).
> # vmstat -zm | grep lock
> lockf 2257779 70556K   - 19476940  32,64
> ... and it keeps rising.
> 
> That's another machine with 1Gb RAM, also with 512M for vm.kmem_size_max.

OK, what version was this again?

Kris




Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-24 Thread Vladimir Sharun
Leak in lockf, confirmed:
lockf 80448  2516K   -   709113  32,64
lockf 80450  2516K   -   709155  32,64
lockf 80453  2516K   -   709199  32,64
lockf 80452  2516K   -   709207  32,64
lockf 80455  2516K   -   709226  32,64
lockf 80455  2516K   -   709236  32,64
lockf 80459  2516K   -   709250  32,64
lockf 80461  2516K   -   709280  32,64
lockf 80466  2516K   -   709317  32,64
lockf 80474  2516K   -   709376  32,64
lockf 80475  2516K   -   709396  32,64
lockf 80477  2517K   -   709427  32,64
lockf 80481  2517K   -   709445  32,64
lockf 80482  2517K   -   709472  32,64
lockf 80484  2517K   -   709488  32,64
lockf 80490  2517K   -   709547  32,64
lockf 80498  2517K   -   709578  32,64
lockf 80505  2518K   -   709615  32,64
lockf 80507  2518K   -   709647  32,64
lockf 80510  2518K   -   709700  32,64
lockf 80518  2518K   -   709747  32,64
lockf 80531  2518K   -   709865  32,64
lockf 80540  2519K   -   709940  32,64
lockf 80561  2519K   -   710078  32,64
lockf 80590  2520K   -   710263  32,64
lockf 80611  2521K   -   710419  32,64
lockf 80623  2521K   -   710512  32,64
lockf 80625  2521K   -   710530  32,64
lockf 80637  2522K   -   710596  32,64
lockf 80638  2521K   -   710643  32,64
lockf 80641  2522K   -   710681  32,64
lockf 80656  2522K   -   710769  32,64
lockf 80658  2522K   -   710803  32,64
lockf 80666  2522K   -   710859  32,64
lockf 80672  2523K   -   710899  32,64
lockf 80675  2523K   -   710930  32,64
(output from while true; do vmstat -m | grep lockf; sleep 1 ; done)

Vladimir Sharun wrote:
 VS> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
 VS> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
 VS> 335bla-bla allocated".
 VS> I looked in the handbook and put vm.kmem_size_max="536870912" into
 VS> /boot/loader.conf.
 VS> Today the same thing happened with the new parameters. Are there any other
 VS> solutions?

 VS> # sysctl -a | grep kmem
 VS> vm.kmem_size: 536870912
 VS> vm.kmem_size_max: 536870912
 VS> vm.kmem_size_scale: 3

 VS> Only vm.kmem_size_max is set in loader.conf, not vm.kmem_size.

 VS> We're running FreeBSD 6.0-BETA5 #0: Wed Sep 28 16:54:33 EEST 2005
 VS> in i386 mode. The same happened with 5.3/5.4 and NetBSD 2.0 on this machine.



Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-23 Thread Vladimir Sharun
Kris Kennaway wrote:
> >>> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
> >>> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
> >>> 335bla-bla allocated".
> >>> I looked in the handbook and put vm.kmem_size_max="536870912" into
> >>> /boot/loader.conf.
> >>> Today the same thing happened with the new parameters. Are there any other
> >>> solutions?
>> 
>  KK>> If that's not enough, try making it larger.
>> 
>> At what size can we be sure that "that's enough"?

 KK> It depends on your workload, so keep increasing it until the problem
 KK> stops or you discover that you need more RAM to handle your workload.

Looks like a kernel leak in lockf (thanks to Gleb Smirnoff for the tip).
# vmstat -zm | grep lock
lockf 2257779 70556K   - 19476940  32,64
... and it keeps rising.

That's another machine with 1Gb RAM, also with 512M for vm.kmem_size_max.

-- 
UKR.NET Postmaster


Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-23 Thread Kris Kennaway
On Sun, Oct 23, 2005 at 12:05:26PM +0300, Vladimir Sharun wrote:
> Kris Kennaway wrote:
> >> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
> >> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
> >> 335bla-bla allocated".
> >> I looked in the handbook and put vm.kmem_size_max="536870912" into
> >> /boot/loader.conf.
> >> Today the same thing happened with the new parameters. Are there any other
> >> solutions?
> 
>  KK> If that's not enough, try making it larger.
> 
> The second issue: I can't set it too high. 700M or more results in a kernel
> panic during boot; 600M is acceptable.

That probably indicates you're running close to the point where you
need more RAM in your system.

Kris




Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-23 Thread Vladimir Sharun
Kris Kennaway wrote:
>> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
>> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
>> 335bla-bla allocated".
>> I looked in the handbook and put vm.kmem_size_max="536870912" into
>> /boot/loader.conf.
>> Today the same thing happened with the new parameters. Are there any other
>> solutions?

 KK> If that's not enough, try making it larger.

The second issue: I can't set it too high. 700M or more results in a kernel
panic during boot; 600M is acceptable.



Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-23 Thread Kris Kennaway
On Sun, Oct 23, 2005 at 11:21:10AM +0300, Vladimir Sharun wrote:
> Kris Kennaway wrote:
>  KK> On Sun, Oct 23, 2005 at 10:43:42AM +0300, Vladimir Sharun wrote:
> >> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
> >> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
> >> 335bla-bla allocated".
> >> I looked in the handbook and put vm.kmem_size_max="536870912" into
> >> /boot/loader.conf.
> >> Today the same thing happened with the new parameters. Are there any other
> >> solutions?
> 
>  KK> If that's not enough, try making it larger.
> 
> At what size can we be sure that "that's enough"?

It depends on your workload, so keep increasing it until the problem
stops or you discover that you need more RAM to handle your workload.

Kris




Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-23 Thread Vladimir Sharun
Kris Kennaway wrote:
 KK> On Sun, Oct 23, 2005 at 10:43:42AM +0300, Vladimir Sharun wrote:
>> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
>> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
>> 335bla-bla allocated".
>> I looked in the handbook and put vm.kmem_size_max="536870912" into
>> /boot/loader.conf.
>> Today the same thing happened with the new parameters. Are there any other
>> solutions?

 KK> If that's not enough, try making it larger.

At what size can we be sure that "that's enough"?

-- 
UKR.NET Postmaster


Re: kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-23 Thread Kris Kennaway
On Sun, Oct 23, 2005 at 10:43:42AM +0300, Vladimir Sharun wrote:
> We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or
> two it suddenly hangs with "kmem_malloc(4096): kmem_map too small
> 335bla-bla allocated".
> I looked in the handbook and put vm.kmem_size_max="536870912" into
> /boot/loader.conf.
> Today the same thing happened with the new parameters. Are there any other
> solutions?

If that's not enough, try making it larger.

Kris




kmem_malloc(4096): kmem_map too small: 536870912 total allocated

2005-10-23 Thread Vladimir Sharun
We have a 2xOpteron/2Gb RAM server with extensive disk load. Every week or two
it suddenly hangs with "kmem_malloc(4096): kmem_map too small 335bla-bla
allocated".
I looked in the handbook and put vm.kmem_size_max="536870912" into
/boot/loader.conf.
Today the same thing happened with the new parameters. Are there any other
solutions?

# sysctl -a | grep kmem
vm.kmem_size: 536870912
vm.kmem_size_max: 536870912
vm.kmem_size_scale: 3

Only vm.kmem_size_max is set in loader.conf, not vm.kmem_size.

We're running FreeBSD 6.0-BETA5 #0: Wed Sep 28 16:54:33 EEST 2005
in i386 mode. The same happened with 5.3/5.4 and NetBSD 2.0 on this machine.
