I found some old messages about a "memory leak" in AOLserver 3. I think I'm running into the same problem, given the memory and slowness issues we have right now.
 
According to the answers, the source of the problem is Tcl 8.0, and the solution is to upgrade the Tcl library to 8.3.1: http://listserv.aol.com/cgi-bin/wa?A2=ind0008&L=aolserver&D=0&I=-3&X=67CBE07276211DD16C&[EMAIL PROTECTED]&P=1878 (and the solution from Kriston: http://listserv.aol.com/cgi-bin/wa?A2=ind0008&L=aolserver&D=0&I=-3&X=0C65431B8EB0007184&[EMAIL PROTECTED]&P=3300 )
 
Would someone please be kind enough to explain how to upgrade only the Tcl library to 8.3.1 in my AOLserver/3.3.1+ad13 with Tcl 8.3?
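
Before and after any upgrade it helps to confirm which Tcl patch level the running interpreter actually reports. A minimal sketch, assuming it is run either from a page or Tcl library file inside the server (where ns_log is available) or from a plain tclsh; `info tclversion` and `info patchlevel` are standard Tcl commands, everything else here is illustrative:

```tcl
# Report the Tcl version and patch level of the interpreter running this code.
set version [info tclversion]   ;# major.minor only, e.g. 8.3
set patch   [info patchlevel]   ;# full patch level, e.g. 8.3.1

if {[llength [info commands ns_log]] > 0} {
    # Inside AOLserver: write to the server log.
    ns_log Notice "Tcl version $version, patch level $patch"
} else {
    # Plain tclsh: write to stdout.
    puts "Tcl version $version, patch level $patch"
}
```

If the reported patch level does not change after the upgrade, the server binary is presumably still linked against the old library.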
 
Thank you,
Seena
-----Original Message-----
From: Seena Kasmai [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 30, 2003 11:09 AM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung - Memeory problem

Hello Again,

 

I finally put exception handling (catch) after every ns_mutex lock across the application, to make sure we always unlock the mutex. But after running some traffic against the web server, requests to the page that actually calls ns_mutex started getting stuck again, and eventually the server locked up.

 

Then I suspected that we might be locking that mutex simultaneously from the two procs, and that this somehow creates a conflict. After removing the lock from the proc that increments the array, I could no longer make the server lock up! So it looks like we have some sort of conflict when locking the same mutex, although I assumed that lock requests would go into some kind of queue and be released in order as the mutex is unlocked. I wrote a test page that only locks a mutex (and does not unlock it). I ran this page 10 times, and all of the requests got stuck in the queue. Then I ran a page that unlocks the mutex, and each time I ran it, the first request in the queue was released. So the functionality seems to work, but I still don't know why the server gets into trouble in the real case.

 

Another issue that might (or might not) be related: I have noticed that while AOLserver is running, free memory keeps shrinking until the system eventually runs out of memory and the web server dies. Initially, when AOLserver comes up, the system has about 840MB of free memory. So far, in about every 24-hour period, free memory drops below 16MB and the server eventually crashes (and free memory goes back to 875MB). Here is a snapshot of top when the server starts up:

 

 CPU states:  100% idle,  0.0% user,  0.0% kernel,  0.0% iowait,  0.0% swap
Memory: 1024M real, 829M free, 58M swap in use, 4809M swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 27834 nsadmin    8  59    0   52M   47M sleep    0:45  0.02% nsd8x

 

The only thing that could use a lot of memory while traffic is running on the site is Memoize, which our application uses heavily to cache the results of database queries in a list-of-lists format. But I saw the server eating 1MB of memory per second (according to "top") even when nothing was going on on the server!

 

Again, please note that the same code/application and setup runs fine with AOLserver 2.3.3 / Tcl 7, so I can't think of any nasty bug or infinite loop that could exist. I've been looking closely at the error logs and there are no errors. Any comments or ideas on why the new version acts differently in this situation would be greatly appreciated.

 

BTW, here is the configuration file (should I have attached it?):
 
#
# Translated on Thu Jan 16 02:58:05 EST 2003
# from .ini format with translate-ini
#
# config file for a Netra farm box
 
ns_section ns/db/pools
ns_param main main
ns_param subquery subquery
ns_param secondary secondary
ns_param secondary_subquery secondary_subquery
ns_param log log
ns_param clickstream clickstream
ns_param search search
 
ns_section ns/db/drivers
#ora8=ora8.2.0.1-816-.so
ns_param ora8 /home/nsadmin/bin/ora8.so
 
ns_section ns/db/pool/main
ns_param Driver ora8
ns_param Connections 6
ns_param DataSource ora8_tcp
ns_param User
ns_param Password
ns_param Verbose On
ns_param ExtendedTableInfo On
ns_param LogSQLErrors On
 
ns_section ns/db/pool/subquery
ns_param Driver ora8
ns_param Connections 6
ns_param DataSource ora8_tcp
ns_param User
ns_param Password
ns_param Verbose On
ns_param ExtendedTableInfo On
ns_param LogSQLErrors On
 
ns_section ns/db/pool/secondary
ns_param Driver ora8
ns_param Connections 6
ns_param DataSource testds
#DataSource=ora8_tcp
ns_param User
ns_param Password
ns_param Verbose on
ns_param ExtendedTableInfo On
ns_param LogSQLErrors On
 
ns_section ns/db/pool/secondary_subquery
ns_param Driver ora8
ns_param Connections 6
ns_param DataSource testds
#DataSource=ora8_tcp
ns_param User
ns_param Password
ns_param Verbose on
ns_param ExtendedTableInfo On
ns_param LogSQLErrors On
 
ns_section ns/db/pool/log
ns_param Driver ora8
ns_param Connections 2
ns_param DataSource ora8_tcp
ns_param User
ns_param Password
ns_param Verbose On
ns_param ExtendedTableInfo On
ns_param LogSQLErrors On
 
ns_section ns/db/pool/clickstream
ns_param Driver ora8
ns_param Connections 2
ns_param DataSource ora8_tcp
ns_param User clickstream
ns_param Password clickrules
ns_param Verbose On
ns_param ExtendedTableInfo On
ns_param LogSQLErrors On
 
ns_section ns/db/pool/search
ns_param Driver ora8
ns_param Connections 3
ns_param DataSource ora8_tcp
# DataSource=newaway
ns_param User
ns_param Password
ns_param Verbose On
ns_param ExtendedTableInfo On
ns_param LogSQLErrors On
 

ns_section ns/db/driver/ora8
ns_param maxStringLogLength 8196
 
ns_section ns/parameters
ns_param User nsadmin
ns_param Home /home/nsadmin
ns_param Debug off
ns_param ServerLog /home/away-logs/away-n5-error.log
ns_param StackSize 500000
ns_param umask 002
ns_param catchExceptions off
ns_param auxconfigdir /web/away/parameters
 
ns_section ns/server/away
ns_param PageRoot /web/away/www
ns_param DirectoryFile {index.tcl, index.adp, index.html, index.htm}
ns_param Webmaster
ns_param NoticeBgColor #ffffff
ns_param EnableTclPages On
ns_param NotFoundResponse /global/file-not-found.html
ns_param ServerBusyResponse /global/server-busy.html
ns_param ServerInternalErrorResponse /global/error.html
ns_param MaxThreads 40
#MaxBusyThreads=20
ns_param MaxBusyThreads 0
ns_param MaxWait 15
ns_param DirectoryListing none
ns_param checkstats on
ns_param checkStatsInterval 60
ns_param Fancy On
 
ns_section ns/server/away/adp/parsers
ns_param "fancy" ".tcl"
ns_param "fancy" ".adp"
 
ns_section ns/server/away/db
ns_param Pools *
ns_param DefaultPool main
 
ns_section ns/server/away/cgi
ns_param Map {GET /*.cgi}
ns_param Map {POST /*.cgi}
 
ns_section ns/server/away/adp
ns_param Map /*.adp
ns_param Map /*.help
ns_param Map /*.js
ns_param Map /*.asp
ns_param Map /*.htm
ns_param Map /*.html
ns_param "DefaultParser" "fancy"
 
ns_section ns/server/away/module/nslog
ns_param enablehostnamelookup Off
ns_param file /home/away-logs/away-n5.log
ns_param maxbackup 10
ns_param rollday *
ns_param rollfmt %Y-%m-%d-%H.%M
ns_param rollhour 0
ns_param rollonsignal On
ns_param rolllog On
ns_param ExtendedHeaders Referer,User-Agent,Host,Cookie
 
ns_section ns/server/away/module/nsperm
ns_param model Small
ns_param enablehostnamelookup Off
 
ns_section ns/server/away/module/nssock
ns_param timeout 120
ns_param Address
#ns_param Port
ns_param Hostname
#Hostname=
 
ns_section ns/server/away/module/nssock_atb
ns_param timeout 120
#ns_param port 8084
ns_param Address
ns_param Hostname
 
ns_section ns/server/away/module/nsopenssl
ns_param ServerAddress
#ns_param Port 8085
ns_param ServerHostname
ns_param CertFile /home/nsadmin/servers/away/test-cert.pem
ns_param KeyFile /home/nsadmin/servers/away/test-key.pem
ns_param SockServerCertFile /home/nsadmin/servers/away/test-cert.pem
ns_param SockServerKeyFile /home/nsadmin/servers/away/test-key.pem
ns_param SockServerProtocols             "SSLv2, SSLv3, TLSv1"
ns_param SockServerCipherSuite           "ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP"
ns_param SockServerSessionCache          false
ns_param SockServerSessionCacheID        2
ns_param SockServerSessionCacheSize      512
ns_param SockServerSessionCacheTimeout   300
ns_param SockServerPeerVerify            true
ns_param SockServerPeerVerifyDepth       3
ns_param SockServerCADir                 internal_ca
ns_param SockServerCAFile                internal_ca.pem
ns_param SockServerTrace                 false
 

ns_section ns/server/away/MimeTypes
ns_param Default text/plain
ns_param NoExtension text/plain
ns_param .pcd image/x-photo-cd
ns_param .prc application/x-pilot
ns_param .css text/css
ns_param .doc application/msword
ns_param .rtf application/msword
ns_param .xls application/msexcel
ns_param .xlc application/msexcel
ns_param .fm4 application/x-framemaker
ns_param .fm5 application/x-framemaker
ns_param .ppt application/vnd.ms-powerpoint
ns_param .pot application/vnd.ms-powerpoint
ns_param .pps application/vnd.ms-powerpoint
ns_param .dvi application/x-dvi
 
ns_section ns/server/away/nscache
ns_param cacheADP on
 
ns_section ns/server/away/module/nscache/adp
ns_param dostat on
ns_param defaultexpires 3600
ns_param maxsize 100000000
 
ns_section ns/server/away/modules
ns_param nsperm nsperm.so
ns_param nslog nslog.so
ns_param nssock nssock.so
ns_param nsopenssl nsopenssl.so
ns_param nssock_atb nssock.so
#nsftp=nsftp.so
ns_param nscache nscache.so
 
ns_section ns/server/away/tcl
ns_param Library /web/away/tcl
 
ns_section ns/servers
ns_param away away
 
ns_section ns/setup
ns_param Enabled Off
ns_param Port 9799
ns_param Password t2o8WCGDYZddU
 
ns_section ns/threads
ns_param systemscope on
 
ns_section ns/server/away/acs
ns_param PrimaryServerP 0
ns_param ClickTestServerP 1
 
ns_section ns/server/away/acs/cs/logging
# is clickstreaming on?
ns_param EnabledP 0
# work with old systems - do all session management by ourselves?
ns_param LegacyP 1
# which pages are being served and should be logged?
# not used yet (or maybe ever)
ns_param PageExtensions {tcl, adp, html}
# how many user sessions before a user is not a new user?
ns_param NewUserThreshold 10
# logfile - ".%Y-%m-%d" is added to this
ns_param Logfile /home/click-logs/away-n2-cs.log
# archive template
ns_param ArchiveFile /home/nsadmin/log/away/old-cs-logs/away4-cs-log
 
 
 
-----Original Message-----
From: Seena Kasmai [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 27, 2003 6:08 PM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung

Nathan - If you look at the code, it does lock before attempting any manipulation of that array.

#####################################
# Shared counter arrays and the mutex that guards them
ns_share counter_A
ns_share counter_B
ns_share -init { set counter_mutex [ns_mutex create] } counter_mutex

# Increment both counters for index i while holding the mutex
proc X {i} {
    ns_share counter_A
    ns_share counter_B
    ns_share counter_mutex

    ns_mutex lock $counter_mutex
    incr counter_A($i) 1
    incr counter_B($i) 1
    ns_mutex unlock $counter_mutex
}

# Copy the counters into local arrays and clear them while holding the mutex
proc_doc Y {} {
    ns_share counter_A
    ns_share counter_B
    ns_share counter_mutex

    ns_mutex lock $counter_mutex
    foreach i_index [array names counter_A] {
        set temp_counter_A($i_index) $counter_A($i_index)
        set temp_counter_B($i_index) $counter_B($i_index)

        unset counter_A($i_index)
        unset counter_B($i_index)
    }
    ns_mutex unlock $counter_mutex

    ## writing $temp_counter_A and $temp_counter_B arrays to database
}

#####################################

-----Original Message-----
From: Nathan Folkman [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 27, 2003 2:40 PM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung

In a message dated 1/27/2003 2:22:09 PM Eastern Standard Time, [EMAIL PROTECTED] writes:

Regarding the error handling in this code: as you can see, the only thing between the lock and unlock is incrementing the arrays, and the database action takes place after unlocking. Since the existence of the arrays is also tested before attempting to use ns_mutex, I'm assuming that no error could cause the ns_mutex unlock to be skipped because of an exception. Nothing shows up in the error log either.


careful - you might have a race condition. consider this scenario:

THREAD 1:
- check for existence of array(key)
- lock
- do something with array(key)
- unlock

THREAD 2:
- unset array(key)

thread 2 could unset your array after you've checked for its existence, and before you did something with it. to fix the scenario above you'd need to lock around all access to your array and move the check for existence inside the lock as well:

THREAD 1:
- lock
- check for existence of array(key)
- do something with array(key)
- unlock

THREAD 2:
- lock
- unset array(key)
- unlock

better still is to catch and handle errors around code which acquires a mutex lock. this allows you to properly unlock and prevents deadlock situations where you've acquired a lock, an error occurs, and you never release the lock.
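
The two points above (existence check inside the lock, and catch around the locked region) can be sketched roughly as follows. This is only an illustration under stated assumptions: with_mutex and bump are hypothetical helper names, the counters are plain globals rather than ns_share'd arrays, and the small ns_mutex stand-in exists only so the sketch also runs in a plain tclsh. Inside AOLserver the real ns_mutex command is used.

```tcl
# Stand-in for ns_mutex so the sketch also runs in a plain tclsh.
# Inside AOLserver the real command exists and this block is skipped.
if {[llength [info commands ns_mutex]] == 0} {
    proc ns_mutex {cmd args} {
        global stub_locked
        switch -- $cmd {
            create { set stub_locked 0; return stub_mutex }
            lock   { set stub_locked 1 }
            unlock { set stub_locked 0 }
        }
        return
    }
}

# with_mutex (hypothetical helper): run script in the caller's scope while
# holding mutex, and release the mutex even if the script raises an error.
proc with_mutex {mutex script} {
    ns_mutex lock $mutex
    set code [catch {uplevel 1 $script} result]
    ns_mutex unlock $mutex
    # Pass the script's outcome (ok, error, return, ...) back to the caller.
    # Note: the original stack trace in errorInfo is not preserved here.
    return -code $code $result
}

set counter_mutex [ns_mutex create]

proc bump {i} {
    # Plain globals here; under AOLserver 3.x these would be ns_share'd.
    global counter_A counter_mutex
    with_mutex $counter_mutex {
        # The existence check sits inside the lock, next to the update.
        if {![info exists counter_A($i)]} {
            set counter_A($i) 0
        }
        incr counter_A($i)
    }
}

bump home
bump home
puts "home = $counter_A(home)"          ;# prints "home = 2"

# An error inside the locked region still leaves the mutex unlocked.
set failed [catch {with_mutex $counter_mutex {error "boom"}} msg]
puts "failed = $failed, msg = $msg"     ;# prints "failed = 1, msg = boom"
```

Keeping the lock, catch and unlock in one helper means no caller can forget the unlock, which is the deadlock case described above.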

one other note about the nsv_incr command. in versions prior to 4.0 you need to first initialize the nsv array and variable you are incrementing:

nsv_set myArray counter 0
nsv_incr myArray counter

in 4.0 the nsv_incr will create and initialize the array and variable if it doesn't already exist:

nsv_incr myArray counter
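
For code that has to run on versions prior to 4.0, the initialize-then-increment step can be wrapped in a small helper. A rough sketch, not a definitive implementation: counter_incr is a hypothetical name, it assumes nsv_exists returns 0 for a key (or array) that has not been set yet, and the nsv stand-ins below exist only so the sketch also runs in a plain tclsh.

```tcl
# Stand-ins for the nsv commands so the sketch also runs in a plain tclsh.
# Inside AOLserver the real commands exist and this block is skipped.
if {[llength [info commands nsv_incr]] == 0} {
    proc nsv_exists {array key} {
        global nsv_stub
        return [info exists nsv_stub($array,$key)]
    }
    proc nsv_set {array key value} {
        global nsv_stub
        return [set nsv_stub($array,$key) $value]
    }
    proc nsv_incr {array key {count 1}} {
        global nsv_stub
        return [incr nsv_stub($array,$key) $count]
    }
}

# counter_incr (hypothetical helper): initialize the counter on first use,
# as versions prior to 4.0 require, then increment it.
# Caveat: the exists/set pair is not atomic, so two threads hitting a new
# key at the same moment could lose an increment. Initializing the counters
# at server startup, or doing this under a mutex, avoids that.
proc counter_incr {array key {count 1}} {
    if {![nsv_exists $array $key]} {
        nsv_set $array $key 0
    }
    return [nsv_incr $array $key $count]
}

puts [counter_incr myArray counter]   ;# prints 1
puts [counter_incr myArray counter]   ;# prints 2
```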

- nathan

