Well, first of all, sorry for the lack of info. Here is what I think is the relevant info on the servers. The two are identical machines, except for the amount of RAM (the good one has 768 MB and the troubled one has 438 MB).

OS: RedHat 8.0
Perl version: 5.8.5, compiled from source in both cases.
Kernel: 2.4.18-18SGI_XFS_1.2.0, from RPM, not recompiled.

OK, now some instructive things. I limit the number of spamd processes via the -m switch on both servers. In fact, as I said, they're pretty much the same machine, except for the RAM. But when I run "top" on the good one, I get the output below. (This one is able to run SA 3.0.1 and has not complained about anything; in fact, I think it's running faster than it did with 3.0.0.)

    16:05:02 up 12 days, 6:15, 8 users, load average: 0,42, 0,40, 0,37
    165 processes: 159 sleeping, 5 running, 1 zombie, 0 stopped
    CPU states: 22,0% user 10,9% system 51,6% nice 0,0% iowait 15,3% idle
    Mem:  772980k av, 760400k used, 12580k free, 0k shrd, 4k buff
          388184k actv, 177588k in_d, 67820k in_c
    Swap: 524600k av, 19636k used, 504964k free 281572k cached

      PID USER  PRI NI  SIZE   RSS SHARE STAT %CPU %MEM  TIME CPU COMMAND
     5687 root   14 -1  106M   42M 11392 S <   0,0  5,6 46:13   0 X
    13284 root   15  0 37884   36M 19724 S     0,0  4,9  2:37   0 galeon-bin
    12997 root   15  0 29764   29M 11124 R     0,0  3,8  1:31   0 opera
    13749 root   15  0 23804   23M  9808 S     0,0  3,0  0:03   0 java_vm
    29542 spamd  25  0 23212   20M  9496 S    19,3  2,6 16:49   0 spamd
     5809 root   15  0 13568   13M 11672 S     0,0  1,7  0:21   0 kdeinit
     5804 root   15  0 13484   13M 12064 R     0,0  1,7  0:13   0 kdeinit
    12182 root   15  0 12432   12M 10676 S     0,0  1,6  0:05   0 kdeinit
     5822 root   15  0 11468   11M 10240 S     0,0  1,4  0:07   0 kdeinit
     5791 root   15  0 11244   10M 10160 S     0,0  1,4  0:00   0 kdeinit
     5801 root   15  0 11100   10M 10036 S     0,0  1,4  0:22   0 kdeinit
     7591 root   15  0 11064   10M 10108 S     0,0  1,4  0:00   0 kdeinit
     5817 root   15  0 10172  9,9M  9348 R     0,0  1,3  0:01   0 kdeinit
     5819 root   15  0  9944  9940  9260 S     0,0  1,2  0:00   0 kdeinit
     5764 root   15  0  9564  9560  8880 S     0,0  1,2  0:06   0 kdeinit
     5800 root   15  0  9312  9308  8640 S     0,0  1,2  0:00   0 kdeinit

Even though a colleague of mine is working on this server, most of the time I only see a single spamd process in top when I sort by memory usage.

On the other hand, when I was running 3.0.1 on the faulty machine (I had to switch back to 2.64, or my boss would kill me), if I ran top and sorted the processes by memory usage, the spamd processes were always the first ones, starting at 22 MB and getting bigger and bigger. If I increase the number of children allowed to run, it takes a little longer for them to start growing.

But here's a test I did: according to the man pages, --max-conn-per-child should cause a child to die when that number of connections is reached, but this didn't happen. I even tried ridiculously low values (I went as low as three), but the child processes still didn't die. They seemed determined to grow bigger and bigger (reminds me of a virus... never mind, just a bit of fun).

Anyway, I'm out of ideas...
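For anyone who wants to put numbers on this kind of growth instead of watching top, two snapshots of `ps` output can be compared. What follows is only a rough sketch under stated assumptions: it assumes a procps-style `ps` that accepts `-eo pid,rss,comm`, that the processes show up with the command name `spamd` (as in the top output above), and a reasonably recent Python 3 (3.7 or later), which is not what ships with RedHat 8.0. The function names `parse_ps`, `growth`, `exited` and `sample` are made up for illustration and are not part of SpamAssassin.

```python
import subprocess


def parse_ps(text, command="spamd"):
    """Return {pid: rss_kb} for rows of `ps -eo pid,rss,comm` output.

    Rows whose first two fields are not numeric (such as the header row)
    are skipped, so the header does not need special handling.
    """
    sizes = {}
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, rss, comm = fields
        if pid.isdigit() and rss.isdigit() and comm.strip() == command:
            sizes[int(pid)] = int(rss)
    return sizes


def growth(before, after):
    """RSS change in KB for every PID present in both snapshots."""
    return {pid: after[pid] - before[pid] for pid in before if pid in after}


def exited(before, after):
    """PIDs seen in the first snapshot but gone in the second."""
    return sorted(pid for pid in before if pid not in after)


def sample(command="spamd"):
    """Take one snapshot by running ps (assumes procps on Linux)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out, command)
```

Calling `sample()` twice, a few minutes apart, and passing both results to `growth()` shows how much each surviving process grew. `exited()` lists the PIDs that went away in between, which is what one would expect to see regularly if --max-conn-per-child were taking effect. Keep in mind that RSS counts pages shared with the parent, so it overstates what each child costs on its own.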

Luis

On Tue, 26 Oct 2004 11:17:11 -0700, Justin Mason <[EMAIL PROTECTED]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Matt Kettler writes:
> > At 01:49 PM 10/26/2004, Luis Hernán Otegui wrote:
> > >What really pisses me off is the fact that in the rest of my servers,
> > >when SA starts, only the parent process weights 22 MB, the children
> > >weight approx. 5 MB each. But in this particular server, all of the
> > >spamd processes start up as 22 MB processes...
> >
> > It strikes me as rather odd that the size of them is different. In theory
> > they should all be the same.
> >
> > What kernels are the boxes using?
> >
> > One theory I have is that the boxes with 5mb children has a RCU enabled
> > kernel, thus the children are 5mb of their own memory, and the rest is
> > shared with the map of the parent. (RCU causes forked children to share
> > pages of memory with the parent until they modify the page, then it gets
> > reallocated)
> >
> > On the one box which has 22mb children, I suspect there's no RCU support,
> > so the whole 22mb parent is copied at the time of fork().
> >
> > In linux, RCU is present on 2.6.x kernels, although some vendors may have
> > backported it to their 2.4x kernels
>
> nah, that's Copy-On-Write you're thinking of, which has been std in linux
> and most UNIX kernels since 2.2.x ;) Every 2.4.x and 2.6.x kernel will
> do this just fine -- although 2.6.x and Red Hat 2.4.x kernels report their
> statistics incorrectly, making it seem to *not* be working.
>
> I think Luis means that the parent is 22mb, and the children share most of
> that memory except for 5mb of their own VSZ, which is about right.
>
> - --j.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.4 (GNU/Linux)
> Comment: Exmh CVS
>
> iD8DBQFBfpSnMJF5cimLx9ARAtbnAKCMREGoWwGCMFtxF/p2E3rTTF/2VACgmaHo
> qGleOmvPexQ6uCbmgFsflbE=
> =OKIU
> -----END PGP SIGNATURE-----
>
>

--
-------------------------------------------------
GNU-GPL: "May The Source Be With You...
-------------------------------------------------