Title: RE: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung - Memory problem
You are right, Andrew: we are using ACS, and I believe the version is 2.2.3. The info tclversion command says 8.3, but info patchlevel says 8.3.2, and the library directory is aolserver/lib/tcl8.3/, so I'm not sure what is running right now.
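For what it's worth, the two answers need not conflict: as far as I know, info tclversion reports only major.minor, while info patchlevel reports the full patch level, and the library directory is named after major.minor. A minimal sketch of that check follows, in Python purely for illustration. The helper names major_minor and same_release are hypothetical and not part of Tcl or AOLserver.

```python
import re

def major_minor(version):
    """Return (major, minor) from a Tcl-style version string such as '8.3' or '8.3.2'."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError("not a dotted version string: %r" % version)
    return int(match.group(1)), int(match.group(2))

def same_release(tclversion, patchlevel):
    """True when the patch level belongs to the major.minor release given by tclversion."""
    return major_minor(tclversion) == major_minor(patchlevel)

# Values quoted in the message above: 8.3 from info tclversion, 8.3.2 from info patchlevel.
print(same_release("8.3", "8.3.2"))
```

If this returns True for the values the server reports, the interpreter is most likely Tcl 8.3.2 and the tcl8.3 directory name is expected.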
I've been digging into the application, but since everything looks healthy and no errors are being raised, I have no idea what could cause this. We have a lot of tracing and logging in the critical sections and so forth, but as I said, nothing shows up when the web server starts eating all the memory.
I haven't found a reliable way to reproduce the problem, but if we click through the pages for about 10 minutes (load level ~2.5), it shows up. That doesn't tell us much, though, because there might be a specific section that has to be hit to trigger the memory problem. Last night I used some of our admin pages, which hit the database heavily and involve a lot of Tcl. Free memory dropped by 30MB (which might be normal), and after 12 hours or so usage is still at the same level, so I think it has something to do with load and the amount of traffic.
Would using -z (the zippy memory allocator switch) help with tracing or monitoring?
We use ns_share heavily. Could that be the cause?
Thanks,
Seena
P.S. On the subject of memory leaks: should I ignore the discussion I found, which I thought was similar to my problem? Could you access the messages? (I think the links I provided were broken, sorry about that.)
Here is what Kris said about the solution, which seemed to work. I have attached a couple of emails that describe the same issue.
---
On the subject of memory leaks, there is a known symptom of nsd8x
where it can grow without bound in certain circumstances. We do not
yet know the cause, but it appears to be endemic to Tcl 8.3.0. If you
use nsd76 the problem completely disappears.
Kris
-
The next release of AOLserver (which we'll be releasing very soon) has Tcl
8.3.1 which appears to have cleared up the memory leak. It does/will have a
range-checking memory allocator, too. If you have CVS access, you can use it
right now (as of 8/8/2000, in fact).
As far as an official comment, AOLserver is an open-source product.
Anyone with the means and the skill can help debug the server. I fail to
understand how a suggestion to move to nsd76 to solve an evident memory leak
in Tcl 8.3.0 equates to moving to IIS, as one writer on this mailing list
so eloquently put it.
Now, as for nsd76 growing without bound: that is news to AOL Digital City.
They run nsd76 in production on some of the busiest systems in the world and
we have yet to see a memory leak in the core AOLserver 3.0 (it's always been
in various C modules we load for our applications).
It's also important to understand the difference between RSS and SZ. The
RSS, or resident set size, is the amount of core memory being used by a
process. The SZ is the total amount of core memory plus virtual memory being
used. As any Unix administrator or developer can tell you, it is perfectly
normal and acceptable for a process to have a bigger SZ than RSS due to the
simple fact that not all data in a process' address space is used all the
time. This is very dependent on the flavor of Unix -- different systems have
different algorithms that decide when to write pages to swap. If you'd like
to read a fairly simple explanation of this, visit
http://www.freebsd.org/FAQ/misc.html, or see the book Operating System
Concepts, 3e (Silberschatz/Peterson/Galvin), Unix Internals (Vahalia), and
of course the Tanenbaum book.
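To make the RSS versus SZ distinction concrete, here is a minimal sketch in Python that parses ps-style output and reports how much of a process's address space is not resident in core. It assumes a three-column listing of pid, rss and vsz in kilobytes, with vsz standing in for SZ. Column names and units differ between Unix flavors, and the sample numbers are made up for illustration, not measurements from any server.

```python
def parse_ps(text):
    """Parse `pid rss vsz` rows (sizes assumed to be in KB) into a list of dicts."""
    rows = []
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore rows that do not have exactly three columns
        pid, rss, vsz = (int(field) for field in fields)
        rows.append({
            "pid": pid,
            "rss_kb": rss,                 # memory currently resident in core
            "vsz_kb": vsz,                 # total virtual size of the process
            "not_resident_kb": vsz - rss,  # address space not currently in core
        })
    return rows

# Hypothetical sample output; the numbers are invented for this example.
SAMPLE = """\
  PID   RSS    VSZ
 1234 48000 120000
"""

for row in parse_ps(SAMPLE):
    print(row)
```

On a live system the input could come from something like ps -o pid,rss,vsz -p followed by the nsd process id, if the local ps supports those columns. A virtual size larger than the RSS is normal; what matters for a leak is whether the numbers keep growing under steady load.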
Finally, about Purify. We have access to the very latest versions of
Purify. Unfortunately, Purify dumps core when encountering such innocuous
messages as UMR. We are working on getting this issue resolved and using
Purify on Irix in the meantime, and haven't found much to suggest a problem
exists in nsd76 (though we deferred testing nsd8x until Tcl 8.3.1 is put
in).
I hope this message finds understanding readers.
Regards,
Kris
---
-Original Message-
From: Andrew Piskorski [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 31, 2003 2:19 AM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server
to hung - Memory problem
On Thu, Jan 30, 2003 at 09:41:27PM -0500, Seena Kasmai wrote:
With 2.3.3 we use ACS and we use Oracle. Everything in the application seems
We sort of have our own version of ACS (we have added to and modified it). Given
that it's functioning with 3.3.1, is it possible to upgrade to 3.5.1 with Tcl 8.4?
Seena, since your email address is @away.com, I figured you must be
using some flavor of ACS. But, exactly which version of the ACS was
your software based on originally? 3.4, 3.2, maybe even 2.x?