On Thursday, January 30, 2003, at 09:19 PM, Seena Kasmai wrote:

Well, the strange thing is we never see such a behavior on 2.3.3 w/TCL 7.0,
and we run 4 web servers with the same code/application. That's why I
can't think of any code-related issue.
It's been a long time since I've used 2.3.3, but I can't help but think
that there are some functions in 2.3.3 that are not compatible with 3.x,
so I don't think it's possible to pick up a 2.3.3 app (which was Tcl 7.6,
not Tcl 7.0) and run it directly on 3.x without some modifications.  (Well,
no significant application, anyway.  OK, I'm sure there's a
counterexample out there somewhere.)

I did check the size of the cache array we use for memoizing stuff, and
it's not that big at the time the server is eating the memory. We were able
to re-create the problem in 20 minutes just by clicking on various pages
(including TCL pages), and after we stopped clicking the memory kept
getting eaten at about 2-3MB per second; then it stops for a while and
then starts again (with no activity), until it gets down to 16MB, and
then it uses the max swap file allowed until it dies.
That memory is going somewhere.  Perhaps not into the memoize cache; I
only pointed out that one because you identified it in your message.  I
would start generously sprinkling ns_log statements through one of the
execution paths taken by one of the pages you've identified, including
filters and traces.  One possibility is that some function call you made
under 2.3.3 is now failing, and the application is retrying the operation,
which could cause a lot of activity, since the retries will fail as well.
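
To make that concrete, here is a rough sketch of the kind of instrumentation I mean.  The procs trace_step and with_retries are names I made up for illustration; they are not part of AOLserver.  The only server call used is ns_log, and the stub at the top exists only so the sketch also runs in a plain tclsh.

```tcl
# Stub for running outside AOLserver; inside the server the built-in
# ns_log is used and this block does nothing.
if {[llength [info commands ns_log]] == 0} {
    proc ns_log {severity args} {
        puts stderr "$severity: [join $args " "]"
    }
}

# Hypothetical helper: log entry, exit, and elapsed time around a script,
# so the server log shows which step of a page never finishes.
proc trace_step {label script} {
    ns_log Notice "trace: enter $label"
    set start [clock seconds]
    set code [catch {uplevel 1 $script} result]
    set elapsed [expr {[clock seconds] - $start}]
    if {$code == 1} {
        ns_log Error "trace: $label failed after ${elapsed}s: $result"
        error $result
    }
    ns_log Notice "trace: leave $label after ${elapsed}s"
    return $result
}

# Hypothetical helper: retry a script at most $max times and log every
# failure, instead of retrying silently forever.
proc with_retries {label max script} {
    for {set i 1} {$i <= $max} {incr i} {
        if {![catch {uplevel 1 $script} result]} {
            return $result
        }
        ns_log Warning "retry: $label attempt $i of $max failed: $result"
    }
    error "$label: giving up after $max attempts"
}
```

Wrapping each suspect call, for example `trace_step "load prefs" { ... }`, should show in the server log which step is entered but never left, and the bounded retry turns a silent loop into a visible error.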

Is there database activity going on?  Perhaps if you turn on verbose SQL
logging, you'll see a pattern of queries that could point you to the
problem.

Anyhow, would you recommend upgrading to 3.4.2 or 3.5.1 w/ TCL 8.3.1?
If you are using ACS and Oracle, or OpenACS, you must use a version of
AOLserver with arsDigita patches.  If you can upgrade, meaning that you
don't use any ACS stuff nor Oracle, then you want to use 3.5.1, and not
3.4.2.  The 3.5.1 release will allow you to use Tcl 8.4, which is faster,
among other things, but the main thing is that with 3.5.1, if there's a
Tcl update, you can update Tcl without updating AOLserver.  So, if you do
not use ACS or OpenACS, nor Oracle, I suggest upgrading to AOLserver 3.5.1.

Again, given the pathological behavior you're reporting, I strongly doubt
the problem is something as subtle as a bug in Tcl.  I think such a bug
would not manifest itself so dramatically, unless it segfaulted
immediately.

Pete.
