Re: [AOLSERVER] AS 4.0 beta 1 : 2 problems
Regarding #2, did you change the config file to load the nsdb module in ns_section ns/server/$server/modules?

Jamie

Jean-Fabrice RABAUTE wrote:
> Hi all,
>
> I just downloaded AS 4.0 beta 1 this weekend to run some tests. Here are the 2 problems I get with this new version:
>
> 1/ Unable to compile under Solaris 2.7 ('uname' says SunOS 5.7). It looks like the 'configure' command can't get the 'poll' info (I think it's the same bug as 663846, http://sourceforge.net/tracker/index.php?func=detail&aid=663846&group_id=3152&atid=103152). After changing to -DHAVE_POLL=1 and making some minor changes (for modules, I needed to rename OBJS to MODOBJS in each module Makefile for 'make'), I could compile.
>
> 2/ Launching the server using a 'config.tcl' taken from my current 3.2 server, I get 1 error and 1 strange behavior. The error is that the oracle driver is not loaded! Do I have to add the oracle driver to the 'ns/servers/server-name/modules' list? The strange behavior is bug 667651.
>
> If anybody has a solution, mainly for the oracle DB driver load, I will appreciate any input!
>
> Thanks a lot. Best regards.
>
> Jean-Fabrice RABAUTE
> Core Services
> http://www.core-services.fr
> Mob: +33 (0)6 13 82 67 67
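For readers following along, Jamie's suggestion might look roughly like the config fragment below. This is only a sketch under stated assumptions: the server name, the module file names (nssock.so, nsdb.so), the driver file name (ora8.so) and the use of an ns/db/drivers section are illustrative and not taken from the thread, so check them against the sample config shipped with your AOLserver release.

```tcl
# Hypothetical excerpt from config.tcl (all names and paths are assumptions).
set server "server1"   ;# hypothetical server name

# Modules loaded for this virtual server; the thread suggests that in 4.0
# nsdb has to be listed here explicitly.
ns_section "ns/server/${server}/modules"
ns_param   nssock nssock.so
ns_param   nsdb   nsdb.so

# Database drivers are assumed to be listed in their own section rather
# than in the modules list; "ora8.so" is an assumed driver file name.
ns_section "ns/db/drivers"
ns_param   ora8   ora8.so
```

If the driver still fails to load, the server log at start-up is the first place to look for the reason.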
Re: [AOLSERVER] AS 4.0 beta 1 : 2 problems
Several folks are having issues with configure and poll. The default compiler in the configure script is 'cc', which fails in a lot of environments, and configure then erroneously determines that poll is not available (which causes problems later in the compile). You may want to try this workaround:

    setenv CC gcc
    configure options

(You may want to unsetenv CC before running your make, though.) I was able to compile on 2.7 with this workaround. We are changing the configure script to default to gcc instead (since this is the 'compiler of choice' for AOLserver).

Elizabeth Thomas
Principal Software Engineer
America Online, Inc.

[EMAIL PROTECTED] wrote:
> 1/ Unable to compile under Solaris 2.7 ('uname' says SunOS 5.7). It looks like the 'configure' command can't get the 'poll' info [...]
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
Hey Nathan!

Here is the simplified version of the code which shows how we are using ns_mutex in our application. Basically, proc A is called a lot (more than 100 times a minute) across the applications, and proc B is scheduled to run every ~5 minutes. The primary reason for using ns_mutex here is to protect the counters' values from being accessed by other threads while they are being manipulated (incremented/written/cleared). Please feel free to criticize this code as much as you can!

Again, we are seeing that AOLserver 3.3.1 gets into trouble after these procs are called heavily (eventually the server goes down). Just by taking out the ns_mutex lines, we have no problem! Previously we never had any problem running these on version 2.3.3.

In the meantime, regarding ns_share: what is the major issue with it that makes people discourage its use?

Thanks!
--Seena

    #
    ns_share counter_A
    ns_share counter_B
    ns_share -init { set counter_mutex [ns_mutex create] } counter_mutex

    proc X {i} {
        ns_share counter_A
        ns_share counter_B
        ns_share counter_mutex
        ns_mutex lock $counter_mutex
        incr counter_A($i) 1
        incr counter_B($i) 1
        ns_mutex unlock $counter_mutex
    }

    proc_doc Y {} {
        ns_share counter_A
        ns_share counter_B
        ns_share counter_mutex
        ns_mutex lock $counter_mutex
        foreach i_index [array names counter_A] {
            set temp_counter_A($i_index) $conter_A($i_index)
            set temp_counter_B($i_index) $conter_B($i_index)
            unset $conter_A($i_index)
            unset $conter_B($i_index)
        }
        ns_mutex unlock $counter_mutex
        ## writing $temp_counter_A and $temp_counter_B arrays to database
    }
    #

-Original Message-
From: Nathan Folkman [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 24, 2003 7:08 PM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex lock / unlock is likely causing our AOL webserver to...

In a message dated 1/24/2003 4:47:20 PM Eastern Standard Time, [EMAIL PROTECTED] writes:
> Any more inputs regarding this matter will greatly be appreciated.
Any chance you could provide a few snippets of code showing where you are locking and unlocking, and the work you are doing in between? Hard to tell what the problem is. If I had to guess, however, it sounds like you are deadlocked. Perhaps you are locking, throwing an uncaught error, and never unlocking? Or maybe you are just experiencing contention around your database, which is causing other requests to back up waiting for that resource...

If you can provide some more detailed information, including anything odd you see in the server log, that would be great! You also might want to check the SYSLOG for any database errors which could point to the problem.

Also, have you considered upgrading to at least AOLserver 3.4.2 or, even better, 3.5.1? I would need more information to know exactly what you are trying to do, but you might be able to use the nsv_incr command for your counters. The nsv data structure is similar to ns_share variables in that you can share variables between multiple threads/interps. The nsv implementation is a lot cleaner, and handles all the synchronization for you. Plus, as I mentioned before, there's a nifty nsv_incr command specifically for things like counters. ns_share is not recommended, especially when running Tcl 8.x.

- Nathan

---

Thanks Andrew for your input. We use Solaris as well, and AOLserver seems to work fine in every other situation except when ns_mutex comes into play. Here are more details on how we are using it.

We use ns_mutex inside a scheduled proc, which writes a cached array of numbers (counters) to the database. This proc is scheduled for every 5 minutes. It locks that array (so that no other process can manipulate the array while it is being written to the db), writes the numbers to the db, resets the counters, and then unlocks the array using ns_mutex unlock. Notice that this array is ns_share'd.
While everything seems to function and be happy at first, once the webserver gets more traffic we start seeing that all the processes that have attempted to access that array are waiting in the queue. At this stage the nsd process takes most of the CPU and the webserver almost stops responding to HTTP requests. If we stop the traffic, eventually (sometimes after a long time) the server comes back to normal operation and the queue becomes empty.

I modified that scheduled proc only so that it does not lock the array (no ns_mutex use), and after making this change the webserver never got into trouble. That's why I'm almost certain that ns_mutex is causing problems. I suspect the combination of ns_share and ns_mutex on that array might be the cause. I also noticed that doing "upvar" on an ns_share'd variable doesn't work!

Any more inputs regarding this matter will greatly be appreciated.

Thanks
Seena
Re: [AOLSERVER] AS 4.0 beta 1 : 2 problems
Thanks! It is working, with several warnings, but working.

Best regards.

Jean-Fabrice RABAUTE
Core Services
http://www.core-services.fr
Mob: +33 (0)6 13 82 67 67

-Original Message-
From: AOLserver Discussion [mailto:[EMAIL PROTECTED]] On behalf of Elizabeth Thomas
Sent: Monday, 27 January 2003 17:10
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] AS 4.0 beta 1 : 2 problems

Several folks are having issues with the configure and poll. [...] You may want to try this workaround:

    setenv CC gcc
    configure options

[...]
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
Put catches around your locked code and you may find a bug, for example, incrementing an array var that doesn't exist or unsetting an array var that doesn't exist. Without ns_mutex calls, the code may blow up but your server won't lock up.

Jim
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
In a message dated 1/27/2003 11:21:51 AM Eastern Standard Time, [EMAIL PROTECTED] writes:
>     proc_doc Y {} {
>         ns_share counter_A
>         ns_share counter_B
>         ns_share counter_mutex
>         ns_mutex lock $counter_mutex
>         foreach i_index [array names counter_A] {
>             set temp_counter_A($i_index) $conter_A($i_index)
>             set temp_counter_B($i_index) $conter_B($i_index)
>             unset $conter_A($i_index)
>             unset $conter_B($i_index)
>         }

Is the above the actual snippet from the code? If so, my guess is the typos ($conter_A, $conter_B instead of $counter_A, $counter_B) are throwing errors and the mutex is not getting freed, causing deadlocking in your app. Removing the ns_mutex lines from the code wouldn't fix the errors, but the deadlocks would not occur.

Are there any errors in your server log during the times the proc_doc Y runs?

~Rich
___
Rich Fredericks
Software Engineer
AOL Web Services Publishing
p: 703.265.0364
e: [EMAIL PROTECTED]
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
Good point, but there is logic before these lines (which I have taken out; the actual code is a couple of pages, but it is pre-processing and error checking) that takes care of the errors. I don't think an exception/error is the cause, especially because the same code has been working for years on a very high-traffic website. Sorry, it looks like I had typos in the sample code, but you see the point anyway. Here is the code again:

    #
    ns_share counter_A
    ns_share counter_B
    ns_share -init { set counter_mutex [ns_mutex create] } counter_mutex

    proc X {i} {
        ns_share counter_A
        ns_share counter_B
        ns_share counter_mutex
        ns_mutex lock $counter_mutex
        incr counter_A($i) 1
        incr counter_B($i) 1
        ns_mutex unlock $counter_mutex
    }

    proc_doc Y {} {
        ns_share counter_A
        ns_share counter_B
        ns_share counter_mutex
        ns_mutex lock $counter_mutex
        foreach i_index [array names counter_A] {
            set temp_counter_A($i_index) $counter_A($i_index)
            set temp_counter_B($i_index) $counter_B($i_index)
            unset $counter_A($i_index)
            unset $counter_B($i_index)
        }
        ns_mutex unlock $counter_mutex
        ## writing $temp_counter_A and $temp_counter_B arrays to database
    }
    #

-Original Message-
From: Jim Wilcoxson [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 27, 2003 11:41 AM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung

Put catches around your locked code and you may find a bug [...]
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
In a message dated 1/27/2003 10:43:28 AM Central Standard Time, [EMAIL PROTECTED] writes:
>     foreach i_index [array names counter_A] {
>         set temp_counter_A($i_index) $conter_A($i_index)
>         set temp_counter_B($i_index) $conter_B($i_index)
>         unset $conter_A($i_index)
>         unset $conter_B($i_index)
>     }
>
> Is the above the actual snippet from the code? If so, my guess is the typos ($conter_A, $conter_B instead of $counter_A, $counter_B) are throwing errors and the mutex is not getting freed, causing deadlocking in your app. [...]

I'm guessing the code above is not the actual code, but in addition to Rich's comment, your "unset" lines should not have var substitution ($), but rather should just be the varname itself:

    unset conter_A($i_index)
    unset conter_B($i_index)
          ^-- should be no "$"

-- michael
__
michael richman
princ software engineer
aol infrastructure dev
214.442.6048
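To make Michael's point concrete, here is a small plain-Tcl sketch (no AOLserver commands needed) of what the two forms of unset do. The array contents and the key "home" are made up for illustration; only the unset behaviour itself is the point.

```tcl
# Illustrative data: one counter keyed by a hypothetical page name.
set counter_A(home) 5
set i_index home

# With the "$", Tcl substitutes the element's VALUE first, so unset is
# asked to remove a variable literally named "5" and raises an error.
set rc_with_dollar [catch {unset $counter_A($i_index)} msg_with_dollar]

# Without the "$", unset receives the variable NAME and removes the element.
set rc_without_dollar [catch {unset counter_A($i_index)} msg_without_dollar]

puts "with \$:    rc=$rc_with_dollar ($msg_with_dollar)"
puts "without \$: rc=$rc_without_dollar"
puts "element still exists: [info exists counter_A(home)]"
```

If the first form runs between ns_mutex lock and ns_mutex unlock without a catch, the error aborts the proc before the unlock is reached, which is the failure mode Rich describes.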
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
Sorry, there is no $ sign in the actual code.

So, is it worth trying to substitute ns_share with nsv stuff (nsv_set nsv_get) to see if the problem goes away?

Thanks,
Seena

-Original Message-
From: Michael Richman [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 27, 2003 11:50 AM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung

[...] your "unset" lines should not have var substitution ($), but rather should just be the varname itself [...]
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
On Mon, Jan 27, 2003 at 12:25:32PM -0500, Seena Kasmai wrote:
> So, is it worth trying to substitute ns_share with nsv stuff (nsv_set nsv_get) to see if the problem goes away?

Yes! With AOLserver 3.x or 4.x, you should always be using nsv instead of ns_share if you can.

--
Andrew Piskorski
[EMAIL PROTECTED]
http://www.piskorski.com
[AOLSERVER] nsjava and aolserver 3.4.2
Hi!

Is anyone using nsjava with aolserver 3.4+? I've compiled the module (after changing a few things in the makefiles). Now, when I'm starting up aolserver, it says:

    Error: startJvm: unable to get libjava.so library handle

I followed the directions in the README file. I'm using SuSE 8.1 with Sun JDK 1.3.1 and aolserver 3.4.2 with the I18N patch.

Thanks
wiwo
Re: [AOLSERVER] nsjava and aolserver 3.4.2
Wolfgang Winkler wrote:
> Hi! Is anyone using nsjava with aolserver 3.4+? I've compiled the module (after changing a few things in the makefiles).

What did you change in the makefiles?

> Now, when I'm starting up aolserver, it says:
>
>     Error: startJvm: unable to get libjava.so library handle
>
> I followed the directions in the README file. I'm using SuSE 8.1 with Sun JDK 1.3.1 and aolserver 3.4.2 with the I18N patch.

I haven't used aolserver 3.4.2, but I have used 3.5.1 and I've compiled it on Redhat 6.2, 7.3, and 8.0, and I also compiled it on debian 2.2. Go to the java home directory and run:

    find . -name '*.so'

and send me the output. Also, if you could send me the configure output, that would be useful.

Regards,
Dan
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
+-- On Jan 27, Seena Kasmai said:
> Sorry, there is no $ sign in the actual code. So, is it worth trying to substitute ns_share with nsv stuff (nsv_set nsv_get) to see if the problem goes away?

Your most effective action, if you want to maximize the utility of the advice from this mailing list, would be to create a test case that reproduces the problem, and post it in its entirety. Posting a simplified version of your problematic production code is not very helpful, because the simplified version is likely to omit whatever is causing the problem.

That said, I doubt that using nsv_* instead of ns_share will help. Given your description of the problem, the most likely cause is that an error is occurring in a critical section (a section where the mutex is locked), preventing the Tcl interpreter from reaching the ns_mutex unlock command. You have not yet proved to us that this is not the case. So the next logical step (other than creating a test case) is to test the hypothesis that such is the case, by putting catch commands around your critical sections.

For example, suppose the critical section looks like this:

    ns_mutex lock L
    SCRIPT
    ns_mutex unlock L

Then you should change that to this:

    ns_mutex lock L
    set code [catch {
        SCRIPT
    } result]
    ns_mutex unlock L
    if {$code != 0} {
        return -code $code -errorinfo $::errorInfo \
            -errorcode $::errorCode $result
    }

(You'll need to use different variable names if you already have variables named code and result.) You can see that this guarantees that L will be unlocked, no matter what happens when SCRIPT is executed.
Another approach would be to create a procedure like this:

    proc ns_mutex_eval {lock script} {
        ns_mutex lock $lock
        set code [catch {uplevel 1 $script} result]
        ns_mutex unlock $lock
        return -code $code -errorinfo $::errorInfo \
            -errorcode $::errorCode $result
    }

Then you would change the example critical section above to this:

    ns_mutex_eval L {
        SCRIPT
    }

This way you don't have to worry about reusing the variable names code and result, and you don't have to repeat as much code at each critical section.
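The wrapper idea above can be tried outside AOLserver with a small stand-in for ns_mutex. The sketch below is not the code from the message: the proc name with_mutex, the stub and the stub_locked array are hypothetical names added for illustration, and the stub only records lock state; it does no real locking. It also reads errorInfo and errorCode only after an actual error, as a precaution.

```tcl
# Stand-in for ns_mutex so the sketch also runs in a plain tclsh.
# Inside AOLserver the real command exists and this block is skipped.
if {[llength [info commands ns_mutex]] == 0} {
    array set stub_locked {}
    proc ns_mutex {cmd {lock ""}} {
        global stub_locked
        switch -- $cmd {
            create  { return "stub-mutex" }
            lock    { set stub_locked($lock) 1 }
            unlock  { set stub_locked($lock) 0 }
            default { error "ns_mutex stub: unsupported subcommand $cmd" }
        }
    }
}

# Run $script in the caller's scope with $lock held, and always unlock.
proc with_mutex {lock script} {
    ns_mutex lock $lock
    set code [catch {uplevel 1 $script} result]
    if {$code == 1} {
        # Save the error state before any other command can disturb it.
        set savedInfo $::errorInfo
        set savedCode $::errorCode
    }
    ns_mutex unlock $lock
    if {$code == 1} {
        return -code error -errorinfo $savedInfo -errorcode $savedCode $result
    }
    # Pass through normal results and other exceptional return codes.
    return -code $code $result
}

# Usage: the script fails, the error reaches the caller, the lock is free.
set m [ns_mutex create]
set rc [catch {with_mutex $m {unset no_such_counter}} msg]
puts "rc=$rc msg=$msg"
```

Because the unlock sits between the catch and the re-raised error, a failing critical section still surfaces in the log but can no longer leave the mutex held.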
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
In a message dated 1/27/2003 12:32:51 PM Eastern Standard Time, [EMAIL PROTECTED] writes:
> Sorry, there is no $ sign in the actual code. So, is it worth trying to substitute ns_share with nsv stuff (nsv_set nsv_get) to see if the problem goes away?
>
> Thanks,
> Seena

I would definitely recommend using nsv's instead of ns_share variables, especially if you are running Tcl 8.x. For your application, you'll probably want to take a look at the nsv_incr command specifically.

Here's another tip which can help when dealing with lock contention. First, make sure you are creating named mutex locks:

    ns_mutex create counter

Second, enable mutex metering in your AOLserver configuration. Be aware that this causes some additional lock contention itself, so I'd recommend only enabling this in a development environment:

    ns_section "ns/threads"
    ns_param mutexmeter on

Lastly, with mutex metering enabled, you can use the "ns_info" command from the control port or an .adp page to find out what locks are causing the most contention. The latest AOLserver 3.5.x release contains a web-based stats interface that displays this information. Here's a little script which essentially does the same thing:

    set results "NAME(ID): #LOCK, #BUSY, CONTENTION\n"
    foreach lock [ns_info locks] {
        foreach {name owner id nlock nbusy} $lock {
            if {$nbusy == 0} {
                set contention 0.0
            } else {
                set contention [expr double($nbusy*100.0/$nlock)]
            }
        }
        append results "${name}(${id}): ${nlock}, ${nbusy}, ${contention}%\n"
    }
    return $results

Hope this helps!

- Nathan
Re: [AOLSERVER] nsjava and aolserver 3.4.2
Hi Dan!

Taking a closer look at the configure output (below), it seems that my Java environment isn't being found properly, or am I missing something else?

I had to change the INC setting in Makefile.global to:

    INC = -I/root/download/aolserver/3.4_patched/aolserver-3.4.2/include -I/root/download/aolserver/3.4_patched/aolserver-3.4.2/nsd -I/usr/lib/SunJava2-1.3.1/include/ -I/usr/lib/SunJava2-1.3.1/include/linux/

And I set NRPATH to -R $(Rpath) without the Xlinker directive in src/Makefile.

- A find -name *.so in my $JAVA_HOME directory outputs:

    ./jre/lib/i386/green_threads/libhpi.so
    ./jre/lib/i386/libnet.so
    ./jre/lib/i386/libverify.so
    ./jre/lib/i386/libioser12.so
    ./jre/lib/i386/libfontmanager.so
    ./jre/lib/i386/libzip.so
    ./jre/lib/i386/native_threads/libhpi.so
    ./jre/lib/i386/libhprof.so
    ./jre/lib/i386/libjsound.so
    ./jre/lib/i386/client/libjvm.so
    ./jre/lib/i386/libmlib_image.so
    ./jre/lib/i386/libJdbcOdbc.so
    ./jre/lib/i386/libjava.so
    ./jre/lib/i386/libjawt.so
    ./jre/lib/i386/libjcov.so
    ./jre/lib/i386/server/libjvm.so
    ./jre/lib/i386/libagent.so
    ./jre/lib/i386/libdcpr.so
    ./jre/lib/i386/libjpeg.so
    ./jre/lib/i386/libpreemptive_close.so
    ./jre/lib/i386/libjavaplugin_jni.so
    ./jre/lib/i386/libawt.so
    ./jre/lib/i386/libcmm.so
    ./jre/lib/i386/classic/libjvm.so
    ./jre/plugin/i386/ns4/javaplugin.so
    ./jre/plugin/i386/ns600/libjavaplugin_oji.so
    ./lib/i386/libjdwp.so
    ./lib/i386/libdt_socket.so

- A ./configure --with-aolserver-src=../aolserver-3.4.2 outputs:

    checking prefix... /usr/local
    checking for gcc... gcc
    checking for C compiler default output... a.out
    checking whether the C compiler works... yes
    checking whether we are cross compiling... no
    checking for suffix of executables...
    checking for suffix of object files... o
    checking whether we are using the GNU C compiler... yes
    checking whether gcc accepts -g... yes
    checking looking for aolserver ns.h include file... /root/download/aolserver/3.4_patched/aolserver-3.4.2/include/ns.h
    checking looking for aolserver nsd.h include file... /root/download/aolserver/3.4_patched/aolserver-3.4.2/nsd/nsd.h
    checking for jdk version... grep: /jni.h: No such file or directory
    Looks like you are using a 1.1 JVM

    NSJAVA version 1.0.5 configured successfully.

    Using c-compiler: gcc
    JDK Version: 1.1
    Install libnsjava.so: /usr/local/bin
    Aolserver Version: 3_0_PLUS
    Aolserver include: /aolserver-3.4.2/include
    Aolserver nsd include: /aolserver-3.4.2/nsd
    jdk include:
    jdk platform include:
    libjava.so dir:
    libhpi.so dir:
    hpi lib:
    libjvm.so dir:
    jvm lib:
    java classpath:
    RPATH:
    CFLAGS: ''
    LDFLAGS: '-L -ljava '
    configure: creating ./config.status
    config.status: creating Makefile.global

Thanks a lot for your help!
wiwo

--
Digital Concepts
Ideen-Konzepte-Lösungen
[EMAIL PROTECTED]
www.digital-concepts.com
Mobile: +43 699 / 20 88 13 51
Office: +43 732 / 77 27 27
Re: [AOLSERVER] nsjava and aolserver 3.4.2
On 2003.01.27, Wolfgang Winkler [EMAIL PROTECTED] wrote:
> A ./configure --with-aolserver-src=../aolserver-3.4.2 outputs:
> [...]

It'd be handy if the nsjava configure script echoed the $JAVA_HOME setting ... I wonder if it was set when you ran the 'configure' script:

> checking for jdk version... grep: /jni.h: No such file or directory
> [...]
> jdk include:
> jdk platform include:
> libjava.so dir:
> libhpi.so dir:
> hpi lib:
> libjvm.so dir:
> jvm lib:
> java classpath:
> RPATH:

-- Dossy

--
Dossy Shiobara                       mail: [EMAIL PROTECTED]
Panoptic Computer Network             web: http://www.panoptic.com/
  "He realized the fastest way to change is to laugh at your own
    folly -- then you can let go and quickly move on." (p. 70)
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
The error-catching concept is definitely wise. In fact I'll go ahead and put those in. This code is sort of legacy and old, but it's definitely worth revising. The reason I don't think this is the source of the issue is that the same thing works (and has been working) with the older version of AOLserver which we have been using for years. Still, there might be another hole or a different config that makes ns_mutex show up as the problem.

Regarding the error handling in this code: as you see, the only thing between the lock/unlock calls is incrementing the arrays, and the database action takes place after unlocking. Since the existence of the arrays is also tested before attempting to use ns_mutex, I'm assuming that no error could cause the ns_mutex unlock to be skipped because of an exception, plus nothing shows up in the error log either.

That being said, I'll still try to put a catch block everywhere between the ns_mutex lock/unlock calls in the code. I'd also like to try Nathan's mutexmeter solution to see if I find anything new.

Thanks for the advice,
Seena

-Original Message-
From: Andrew Piskorski [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 27, 2003 2:06 PM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung

On Mon, Jan 27, 2003 at 12:23:50PM -0600, Rob Mayoff wrote:
> So the next logical step (other than creating a test case) is to test the hypothesis that such is the case, by putting catch commands around your critical sections. For example, suppose the critical section looks like this:
>
>     ns_mutex lock L
>     SCRIPT
>     ns_mutex unlock L
>
> Then you should change that to this:
>
>     ns_mutex lock L
>     set code [catch {
>         SCRIPT
>     } result]
>     ns_mutex unlock L

This is good advice, and not just for debugging!
When I run ANY with a mutex locked that could ever possibly error out, I always wrap it in a catch to properly clean up the mutex on error. E.g.: ns_mutex lock $data_mutex if { [catch { error Foo! } errmsg] } { # We caught an unexpected error while the mutex was locked, so # unlock the mutex, then re-throw the error: ns_mutex unlock $data_mutex global errorInfo set my_error $errorInfo error $my_error } ns_mutex unlock $data_mutex -- Andrew Piskorski [EMAIL PROTECTED] http://www.piskorski.com
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
In a message dated 1/27/2003 2:22:09 PM Eastern Standard Time, [EMAIL PROTECTED] writes:

> Regarding the error handling in this code: as you can see, the only
> thing between the lock and the unlock is incrementing the arrays, and
> the database action takes place after unlocking. Since the existence
> of the arrays is also tested before attempting to use ns_mutex, I'm
> assuming that no error could cause the ns_mutex unlock to be skipped
> because of an exception; plus, nothing shows up in the error log either.

careful - you might have a race condition. consider this scenario:

THREAD 1:
- check for existence of array(key)
- lock
- do something with array(key)
- unlock

THREAD 2:
- unset array(key)

thread 2 could unset your array after you've checked for its existence, and before you did something with it. to fix the scenario above you'd need to lock around all access to your array and move the check for existence inside the lock as well:

THREAD 1:
- lock
- check for existence of array(key)
- do something with array(key)
- unlock

THREAD 2:
- lock
- unset array(key)
- unlock

better still is to catch and handle errors around code which acquires a mutex lock. this allows you to properly unlock and prevents deadlock situations where you've acquired a lock, an error occurs, and you never release the lock.

one other note about the nsv_incr command. in versions prior to 4.0 you need to first initialize the nsv array and variable you are incrementing:

    nsv_set myArray counter 0
    nsv_incr myArray counter

in 4.0 the nsv_incr will create and initialize the array and variable if it doesn't already exist:

    nsv_incr myArray counter

- nathan
Re: [AOLSERVER] nsjava and aolserver 3.4.2
The $JDK_DIR var in the configure script is set to $JAVA_HOME (/usr/lib/java). I've checked it already. It seems that configure doesn't recognize the right Java version.

On Monday 27 January 2003 20:07, you wrote:
> On 2003.01.27, Wolfgang Winkler [EMAIL PROTECTED] wrote:
> > A ./configure --with-aolserver-src=../aolserver-3.4.2 outputs: [...]
>
> It'd be handy if the nsjava configure script echoed the $JAVA_HOME
> setting ... I wonder if it was set when you ran the 'configure' script:
>
> > checking for jdk version... grep: /jni.h: No such file or directory
> > [...]
> > jdk include:
> > jdk platform include:
> > libjava.so dir:
> > libhpi.so dir:
> > hpi lib:
> > libjvm.so dir:
> > jvm lib:
> > java classpath:
> > RPATH:
>
> -- Dossy

--
Digital Concepts Ideen-Konzepte-Lösungen
[EMAIL PROTECTED] www.digital-concepts.com
Mobil: +43 699 / 20 88 13 51
Büro: +43 732 / 77 27 27
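Dossy's suggestion, that the script say which directory it is searching, could look roughly like the sketch below. This is not code from the nsjava configure script: the helper name `locate_jni` and its messages are hypothetical, and the lookup is a simplified stand-in for whatever the real script does.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the nsjava configure script:
# report which directory is searched and fail early if jni.h is missing,
# instead of carrying an empty include path into the later checks.
locate_jni() {
    JDK_DIR=$1
    echo "searching for jni.h under JDK_DIR='$JDK_DIR'"
    # Directory that contains jni.h (first match only).
    JAVA_INCLUDE=$(find "$JDK_DIR" -name jni.h 2>/dev/null | sed -e 's/\/jni\.h$//' | head -n 1)
    if [ -z "$JAVA_INCLUDE" ]; then
        echo "error: no jni.h found under '$JDK_DIR' (is JAVA_HOME set correctly?)" >&2
        return 1
    fi
    echo "jdk include: $JAVA_INCLUDE"
}
```

Calling something like `locate_jni "$JAVA_HOME" || exit 1` before the remaining checks would turn the silent run of empty "jdk include:" lines into a single explicit error.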
[AOLSERVER] AOLServer 4.0 hanging when killed...
I have installed OpenACS with AOLServer 4.0beta1 and it all seems to work well. The only remaining issue I see is that when I kill the server it hangs at the check for event threads and never actually dies. It does not happen with 3.3+ad13 or 3.5.2 on the same server (well the same once nsdb.so removed from the config). This is using postgres.

I am not sure how to check what event threads it's actually waiting for so don't know how to track this down. Although I am willing to admit it's our fault and not aolserver's :)

[27/Jan/2003:18:43:08][31334.1024][-main-] Notice: nsmain: AOLserver/4.0 stopping
[27/Jan/2003:18:43:08][31334.1024][-main-] Notice: serv: stopping server: oatest4
[27/Jan/2003:18:43:08][31334.1024][-main-] Notice: serv: connection threads stopped
[27/Jan/2003:18:43:08][31334.1024][-main-] Notice: sched: shutdown pending
[27/Jan/2003:18:43:08][31334.2051][-sched-] Notice: sched: shutdown started
[27/Jan/2003:18:43:08][31334.2051][-sched-] Notice: sched: waiting for event threads...
...sits here indefinitely...
Re: [AOLSERVER] AOLServer 4.0 hanging when killed...
Jeff,

How are you controlling AOLserver shutdown?

--Tom Jackson

Jeff Davis wrote:
> I have installed OpenACS with AOLServer 4.0beta1 and it all seems to
> work well. The only remaining issue I see is that when I kill the
> server it hangs at the check for event threads and never actually
> dies. It does not happen with 3.3+ad13 or 3.5.2 on the same server
> (well the same once nsdb.so removed from the config). This is using
> postgres.
>
> I am not sure how to check what event threads it's actually waiting
> for so don't know how to track this down. Although I am willing to
> admit it's our fault and not aolserver's :)
>
> [27/Jan/2003:18:43:08][31334.1024][-main-] Notice: nsmain: AOLserver/4.0 stopping
> [27/Jan/2003:18:43:08][31334.1024][-main-] Notice: serv: stopping server: oatest4
> [27/Jan/2003:18:43:08][31334.1024][-main-] Notice: serv: connection threads stopped
> [27/Jan/2003:18:43:08][31334.1024][-main-] Notice: sched: shutdown pending
> [27/Jan/2003:18:43:08][31334.2051][-sched-] Notice: sched: shutdown started
> [27/Jan/2003:18:43:08][31334.2051][-sched-] Notice: sched: waiting for event threads...
> ...sits here indefinitely...
Re: [AOLSERVER] AOLServer 4.0 hanging when killed...
In a message dated 1/27/2003 3:36:18 PM Eastern Standard Time, [EMAIL PROTECTED] writes: I am not sure how to check what event threads it's actually waiting for so don't know how to track this down. Although I am willing to admit it's our fault and not aolserver's :) sounds like you might be blocked on that condition broadcast waiting for detached event threads to exit... any chance you could attach with a debugger and dig around a little? you'll want to look at sched.c around line 845. also, what os are you running on? thanks! - nathan
Re: [AOLSERVER] nsjava and aolserver 3.4.2
Now I've found out something interesting. My java installation isn't found because my JAVA_HOME points to /usr/lib/java instead of /usr/lib/java/ (note the trailing slash). The line

    JAVA_INCLUDE=`find $JDK_DIR -name jni.h | grep -vi old | sed -e 's/\/jni\.h//'`

can't find anything because the find command does not work without the trailing slash.

Everything compiles fine now without modifications, but when I try to start aolserver I get this:

    Notice: ModuleInit: nsjava ENABLED
    registerTclJavaFunctions: registering commands
    /server.sh: line 3: 28177 Segmentation fault /usr/local/aolserver/bin/nsd -tf /usr/local/aolserver/server.tcl -u nsadmin

regards, wiwo

--
Digital Concepts Ideen-Konzepte-Lösungen
[EMAIL PROTECTED] www.digital-concepts.com
Mobil: +43 699 / 20 88 13 51
Büro: +43 732 / 77 27 27
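A common reason for find to behave this way is that the start directory is a symbolic link: by default find does not descend into a symlink named on the command line, whereas a trailing slash (or the -H option) makes the link resolve to the directory behind it. Whether /usr/lib/java was a symlink on this machine is an assumption; the message does not say. The sketch below reproduces the effect in a throwaway directory, and every path in it is a stand-in.

```shell
#!/bin/sh
# Sketch: why `find $JDK_DIR -name jni.h` can come back empty.
# ASSUMPTION: JAVA_HOME is a symbolic link. All paths are throwaway stand-ins.
tmp=$(mktemp -d)
mkdir -p "$tmp/jdk-real/include"
: > "$tmp/jdk-real/include/jni.h"
ln -s "$tmp/jdk-real" "$tmp/java"     # stand-in for /usr/lib/java

JDK_DIR="$tmp/java"

# No trailing slash: find looks at the link itself and does not descend.
no_slash=$(find "$JDK_DIR" -name jni.h | wc -l | tr -d ' ')

# Trailing slash: the path resolves to the directory behind the link.
with_slash=$(find "$JDK_DIR/" -name jni.h | wc -l | tr -d ' ')

# -H: follow symbolic links given on the command line.
with_H=$(find -H "$JDK_DIR" -name jni.h | wc -l | tr -d ' ')

echo "matches without slash: $no_slash, with slash: $with_slash, with -H: $with_H"
rm -rf "$tmp"
```

If that is what happened here, using `find -H` (or appending the slash inside the script) would make the lookup independent of how JAVA_HOME happens to be written.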
[AOLSERVER] AS 4.0 : How to stop the server
Hi, Testing the new AS 4.0 beta1, I don't know how to shutdown the server. With 3.2, I am using -K on the command line, but on 4.0 I am getting Error: invalid option: -K. If someone has the solution... Thanks ! Jean-Fabrice RABAUTE Core Services http://www.core-services.fr Mob: +33 (0)6 13 82 67 67
Re: [AOLSERVER] nsjava and aolserver 3.4.2
Aolserver bails out at this line in nsjava.c (around line number 205): Ns_ThreadCreate(startJvm, NULL, 0, jvm_thread); Has anyone an idea how this could be solved? Regards, wiwo -- Digital Concepts Ideen-Konzepte-Lösungen [EMAIL PROTECTED] www.digital-concepts.com Mobil: +43 699 / 20 88 13 51 Büro: +43 732 / 77 27 27
Re: [AOLSERVER] AOLServer 4.0 hanging when killed...
We have shutdown problems all the time with 3.4.2. Just another data point. Jim In a message dated 1/27/2003 3:36:18 PM Eastern Standard Time, [EMAIL PROTECTED] writes: I am not sure how to check what event threads it's actually waiting for so don't know how to track this down. Although I am willing to admit it's our fault and not aolserver's :) sounds like you might be blocked on that condition broadcast waiting for detached event threads to exit... any chance you could attach with a debugger and dig around a little? you'll want to look at sched.c around line 845. also, what os are you running on? thanks! - nathan
[AOLSERVER] JVM Error
Hello, I work on JK2 family module for AOLserver and use JNI to communicate with Tomcat, so the JVM is running inside the Web server process. I experience a problem with JVM 1.3.1 on Solaris 2.7/E250 and wonder if anyone else has encountered a similar issue. It manifests itself as JVM crashing, writing a message to the tune HotSpot Virtual Machine Error/Cannot obtain thread information and dumping core due to SIGSEGV. It appears that I can reproduce the error by launching AOLserver with JVM in-process, executing a couple of requests and then letting the server sit idle for about 10 minutes. Subsequent request to Tomcat has a good chance of crashing the server. Examining core in dbx reveals that SEGV always happens in AOLserver request thread, bound to JVM by virtue of using a JNI channel. I can trace the signal to a C++ method, Thread* ThreadLocalStorage::get_thread_via_cache(), local to libjvm.so (debugging data is available if anyone is interested). What could be causing this problem and how can I correct it? I also see a SEGV during JVM initialization. JVM establishes signal handlers to trap and absorb signals, so this one doesn't cause any visible damage. I detect it in dbx and truss, and it seems to originate in JRE library libnet.so evaluating equivalent of C expression *(int*)0. I am puzzled because it looks like intentional dereference of NULL pointer. Your help is appreciated! 
Alex

Typical problem report for the crash:

Unexpected Signal : 11 occurred at PC=0xfdcbc5c4
Function name=JVM_Clone
Library=/opt/java/jre/lib/sparc/server/libjvm.so

Cannot obtain thread information

Dynamic libraries:
0x1 bin/nsd
0xff38 /usr/lib/libsocket.so.1
0xff28 /usr/lib/libnsl.so.1
0xff3a /usr/lib/libdl.so.1
0xff35 /usr/lib/librt.so.1
0xff24 /usr/lib/libthread.so.1
0xff20 /usr/lib/libresolv.so.2
0xff1d /usr/lib/libm.so.1
0xff10 /usr/lib/libc.so.1
0xff0e /usr/lib/libmp.so.2
0xff0c /usr/lib/libaio.so.1
0xff33 /usr/platform/SUNW,Ultra-250/lib/libc_psr.so.1
0xff03 /home/aleykekh/aolserver-3.4-opt/bin/nssock.so
0xfe7e /home/aleykekh/aolserver-3.4-opt/bin/nslog.so
0xfe7c /home/aleykekh/aolserver-3.4-opt/bin/nscgi.so
0xfe75 /home/aleykekh/newtomcat/jakarta-tomcat-connectors-4.1.12-src/jk/build/jk2/aolserver/libnsjk2.so
0xfe70 /home/aleykekh/newtomcat/apache2/lib/libapr-0.so.0
0xfe6c /home/aleykekh/newtomcat/apache2/lib/libaprutil-0.so.0
0xfe68 /usr/local/lib/libexpat.so.0
0xfdc0 /opt/java/jre/lib/sparc/server/libjvm.so
0xfe65 /usr/lib/libCrun.so.1
0xff05 /usr/lib/libw.so.1
0xfe61 /export/0/j2se-1.3.1/jre/lib/sparc/native_threads/libhpi.so
0xfe3d /export/0/j2se-1.3.1/jre/lib/sparc/libverify.so
0xfe39 /export/0/j2se-1.3.1/jre/lib/sparc/libjava.so
0xfe35 /export/0/j2se-1.3.1/jre/lib/sparc/libzip.so
0xfe31 /usr/lib/nss_files.so.1
0xfe5e /home/aleykekh/aolserver-3.4-opt/bin/nscp.so
0xfa7e /export/0/j2se-1.3.1/jre/lib/sparc/libnet.so

Local Time = Fri Jan 24 16:28:43 2003
Elapsed Time = 992
#
# HotSpot Virtual Machine Error : 11
# Error ID : 4F530E43505002CC 01
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#

The function name at the top of the report varies, e.g., JVM_GetMethodIxArgsSize, JVM_RawMonitorEnter.
Re: [AOLSERVER] AOLServer 4.0 hanging when killed...
Nathan, I will check it out in the debugger. I tried to reproduce it just now, but after running for a few minutes (and a couple of scheduled procs) it did not hang on shutdown. I will let it run for a while and see if it happens again (it had happened the last several times I restarted, though). In the debugger nThreads was 0 every time I restarted, so obviously whatever it was waiting for does not show up immediately. Annoying. Also, this is on redhat 7.3 kernel 2.4.19.
[AOLSERVER] Offtopic dumb security policy example, at um, aol
I decided to use a very old netscape mail account to send an anonymous anthrax warning (kidding). This is now run by aol. Anyway, I did appropriate things, found the account, logged in, and visited the options/preferences and changed the alternative email address from an address at an ISP that hasn't existed for about three years to a new and improved (with enzymes!) address. I was very surprised to find a new message, A CONFIRMATION MESSAGE, in that alternative email address's mailbox. I've attached it below. Jeez! Just what is this confirming? Jerry P.S. On an unrelated note, does anyone have Steve Case's email address? I um, would like to find it before he loses fiduciary powers. Screen Name Service wrote: Dear jerryasher, Reply to this e-mail to confirm your e-mail address change from [EMAIL PROTECTED] to [EMAIL PROTECTED] In the reply, type 'OK'. Please send your confirmation within 72 hours. The e-mail address change will be made for the following screen name: jerryasher
Re: [AOLSERVER] AOLServer 4.0 hanging when killed...
Well, waiting a while made it unstoppable... One thing that I noticed was this -sched:idle0- thread which was not there at the start and only showed up later.

name              parent    tid   flags  ctime                          proc           arg
-conn:oatest4::0  -driver-  4101  0      Mon, 27 Jan 2003 21:33:50 GMT  ns:connthread  6 127.0.0.1 running GET /t.adp 0.71483 0
-driver-          -main-    3076  0      Mon, 27 Jan 2003 21:21:51 GMT  p:0x4002e5ac   a:0x0
-main-                      1024  1      Mon, 27 Jan 2003 21:21:46 GMT  p:0x0          a:0x0
-sched-           -main-    2051  0      Mon, 27 Jan 2003 21:21:46 GMT  p:0x4003f2ac   a:0x0
-sched:idle0-     -sched-   6150  0      Mon, 27 Jan 2003 21:36:50 GMT  p:0x4003f05c   a:0x0

Here is some information from gdb (attached to the server after it has hung):

(gdb) attach 30360
...
(gdb) where
#0 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6
#1 0x4013f679 in __pthread_wait_for_restart_signal () from /lib/i686/libpthread.so.0
#2 0x401414b9 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
#3 0x4013e116 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#4 0x4013c2b0 in pthread_cond_timedwait_relative () from /lib/i686/libpthread.so.0
#5 0x4006ea47 in Ns_CondTimedWait (cond=0x4006739c, mutex=0x40067398, timePtr=0xb8b8) at pthread.c:668
#6 0x4003eb95 in NsWaitSchedShutdown (toPtr=0xb8b8) at sched.c:499
#7 0x40038d48 in Ns_Main (argc=4, argv=0xb964, initProc=0x8048674 ServerInit) at nsmain.c:512
#8 0x08048668 in main (argc=4, argv=0xb964) at main.c:64
#9 0x42017499 in __libc_start_main () from /lib/i686/libc.so.6

(gdb) info thread
6 Thread 6150 (LWP 30952) 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6
5 Thread 3076 (LWP 30374) 0x420e0037 in poll () from /lib/i686/libc.so.6
4 Thread 2051 (LWP 30363) 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6
3 Thread 1026 (LWP 30362) 0x420e187e in select () from /lib/i686/libc.so.6
2 Thread 2049 (LWP 30361) 0x420e0037 in poll () from /lib/i686/libc.so.6
1 Thread 1024 (LWP 30360) 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6

(gdb) thread apply all where

Thread 6 (Thread 6150 (LWP 30952)):
#0 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6
#1 0x4013f679 in __pthread_wait_for_restart_signal () from /lib/i686/libpthread.so.0
#2 0x401414b9 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
#3 0x4013e116 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#4 0x4013bfb1 in pthread_cond_wait () from /lib/i686/libpthread.so.0
#5 0x4006e986 in Ns_CondWait (cond=0x400673a0, mutex=0x40067398) at pthread.c:616
#6 0x4003f0e8 in EventThread (arg=0x0) at sched.c:668
#7 0x4006cff7 in NsThreadMain (arg=0x81875f0) at thread.c:224
#8 0x4006eb02 in ThreadMain (arg=0x81875f0) at pthread.c:730
#9 0x4013cfef in pthread_start_thread () from /lib/i686/libpthread.so.0

Thread 5 (Thread 3076 (LWP 30374)):
#0 0x420e0037 in poll () from /lib/i686/libc.so.6
#1 0x4002e87f in DriverThread (ignored=0x0) at driver.c:753
#2 0x4006cff7 in NsThreadMain (arg=0x808bc20) at thread.c:224
#3 0x4006eb02 in ThreadMain (arg=0x808bc20) at pthread.c:730
#4 0x4013cfef in pthread_start_thread () from /lib/i686/libpthread.so.0

Thread 4 (Thread 2051 (LWP 30363)):
#0 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6
#1 0x4013f679 in __pthread_wait_for_restart_signal () from /lib/i686/libpthread.so.0
#2 0x4013c85a in pthread_join () from /lib/i686/libpthread.so.0
#3 0x4006e729 in Ns_ThreadJoin (thread=0x8162e58, argPtr=0x0) at pthread.c:395
#4 0x4003f66a in SchedThread (ignored=0x0) at sched.c:850
#5 0x4006cff7 in NsThreadMain (arg=0x8051370) at thread.c:224
#6 0x4006eb02 in ThreadMain (arg=0x8051370) at pthread.c:730
#7 0x4013cfef in pthread_start_thread () from /lib/i686/libpthread.so.0

Thread 3 (Thread 1026 (LWP 30362)):
#0 0x420e187e in select () from /lib/i686/libc.so.6
#1 0x40131c38 in __DTOR_END__ () from /usr/local/aolserver40//lib/libtcl8.4g.so
#2 0x4013cfef in pthread_start_thread () from /lib/i686/libpthread.so.0

Thread 2 (Thread 2049 (LWP 30361)):
#0 0x420e0037 in poll () from /lib/i686/libc.so.6
#1 0x4013cc70 in __pthread_manager () from /lib/i686/libpthread.so.0

Thread 1 (Thread 1024 (LWP 30360)):
#0 0x420292e5 in sigsuspend () from /lib/i686/libc.so.6
#1 0x4013f679 in __pthread_wait_for_restart_signal () from /lib/i686/libpthread.so.0
#2 0x401414b9 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
#3 0x4013e116 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#4 0x4013c2b0 in pthread_cond_timedwait_relative () from /lib/i686/libpthread.so.0
#5 0x4006ea47 in Ns_CondTimedWait (cond=0x4006739c, mutex=0x40067398, timePtr=0xb8b8) at pthread.c:668
#6 0x4003eb95 in NsWaitSchedShutdown (toPtr=0xb8b8) at sched.c:499
#7 0x40038d48 in Ns_Main (argc=4, argv=0xb964, initProc=0x8048674 ServerInit) at nsmain.c:512
#8 0x08048668 in main (argc=4, argv=0xb964) at
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
Nathan - If you look at the code, it does lock before attempting any manipulation of that array.

#
ns_share counter_A
ns_share counter_B
ns_share -init { set counter_mutex [ns_mutex create] } counter_mutex

proc X {i} {
    ns_share counter_A
    ns_share counter_B
    ns_share counter_mutex
    ns_mutex lock $counter_mutex
    incr counter_A($i) 1
    incr counter_B($i) 1
    ns_mutex unlock $counter_mutex
}

proc_doc Y {} {
    ns_share counter_A
    ns_share counter_B
    ns_share counter_mutex
    ns_mutex lock $counter_mutex
    foreach i_index [array names counter_A] {
        set temp_counter_A($i_index) $counter_A($i_index)
        set temp_counter_B($i_index) $counter_B($i_index)
        unset counter_A($i_index)
        unset counter_B($i_index)
    }
    ns_mutex unlock $counter_mutex
    ## writing $temp_counter_A and $temp_counter_B arrays to database
}
#

-----Original Message-----
From: Nathan Folkman [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 27, 2003 2:40 PM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung

In a message dated 1/27/2003 2:22:09 PM Eastern Standard Time, [EMAIL PROTECTED] writes:

> Regarding the error handling in this code: as you can see, the only
> thing between the lock and the unlock is incrementing the arrays, and
> the database action takes place after unlocking. Since the existence
> of the arrays is also tested before attempting to use ns_mutex, I'm
> assuming that no error could cause the ns_mutex unlock to be skipped
> because of an exception; plus, nothing shows up in the error log either.

careful - you might have a race condition. consider this scenario:

THREAD 1:
- check for existence of array(key)
- lock
- do something with array(key)
- unlock

THREAD 2:
- unset array(key)

thread 2 could unset your array after you've checked for its existence, and before you did something with it. to fix the scenario above you'd need to lock around all access to your array and move the check for existence inside the lock as well:

THREAD 1:
- lock
- check for existence of array(key)
- do something with array(key)
- unlock

THREAD 2:
- lock
- unset array(key)
- unlock

better still is to catch and handle errors around code which acquires a mutex lock. this allows you to properly unlock and prevents deadlock situations where you've acquired a lock, an error occurs, and you never release the lock.

one other note about the nsv_incr command. in versions prior to 4.0 you need to first initialize the nsv array and variable you are incrementing:

    nsv_set myArray counter 0
    nsv_incr myArray counter

in 4.0 the nsv_incr will create and initialize the array and variable if it doesn't already exist:

    nsv_incr myArray counter

- nathan
Re: [AOLSERVER] ns_mutex is likely causing our AOL web server to hung
In a message dated 1/27/2003 6:15:50 PM Eastern Standard Time, [EMAIL PROTECTED] writes: Nathan - If you look at the code it does lock before attempting to any manipulation to that array. Just making sure. ;-) Any luck with the nsv_incr approach or any more data from a server running with mutex metering on? - Nathan
Re: [AOLSERVER] nsjava and aolserver 3.4.2
What does the backtrace look like in gdb? Regards, Dan Wolfgang Winkler wrote: Aolserver bails out at this line in nsjava.c (around line number 205): Ns_ThreadCreate(startJvm, NULL, 0, jvm_thread); Has anyone an idea how this could be solved? Regards, wiwo -- Digital Concepts Ideen-Konzepte-Lösungen [EMAIL PROTECTED] www.digital-concepts.com Mobil: +43 699 / 20 88 13 51 Büro: +43 732 / 77 27 27
Re: [AOLSERVER] Offtopic dumb security policy example, at um, aol
On 2003.01.27, Jerry Asher [EMAIL PROTECTED] wrote: I was very surprised to find a new message, A CONFIRMATION MESSAGE, in that alternative email address's mailbox. I've attached it below. Jeez! Just what is this confirming? Think: Someone creates a Netscape mail account. They sign themselves up for a whole slew of mailing lists, acknowledging the subscription confirmations. Then, they point the alternative email address at someone they dislike. Now, this poor sod is stuck receiving a whole boatload of email they didn't sign up for, from lists they probably don't know how to unsubscribe from ... Confirming that alternative email address should, presumably, stop this potential abuse. The target sod just refuses to acknowledge the confirmation, and that should stop the process right there ... -- Dossy -- Dossy Shiobara mail: [EMAIL PROTECTED] Panoptic Computer Network web: http://www.panoptic.com/ He realized the fastest way to change is to laugh at your own folly -- then you can let go and quickly move on. (p. 70)
Re: [AOLSERVER] Offtopic dumb security policy example, at um, aol
Dossy wrote:
> On 2003.01.27, Jerry Asher wrote:
> > I was very surprised to find a new message, A CONFIRMATION MESSAGE,
> > in that alternative email address's mailbox. I've attached it below.
> > Jeez! Just what is this confirming?
>
> Think: Someone creates a Netscape mail account. They sign themselves
> up for a whole slew of mailing lists, acknowledging the subscription
> confirmations. Then, they point the alternative email address at
> someone they dislike. Now, this poor sod is stuck receiving a whole
> boatload of email they didn't sign up for, from lists they probably
> don't know how to unsubscribe from ...
>
> Confirming that alternative email address should, presumably, stop
> this potential abuse. The target sod just refuses to acknowledge the
> confirmation, and that should stop the process right there ...
>
> -- Dossy

Sure, that keeps the new email address from being screwed up, but at the risk of hijacking someone's email away from them. So I figure out Dossy's AOL screen name, I cons together your daughter's name and guess your password, and then I get to steal your email.

If I am getting email I don't want, I do have some ways of tracking it down and stopping it. If your email just stops one day and you don't realize it for a day, or a week, what are you going to do? How will you ever fix that?

Again, anyone have the aol email addresses for the folks on mahogany row?

Jerry