Re: [AOLSERVER] AOLserver crash related to ns_atclose and namespace commands

2007-01-22 Thread Tom Jackson
On Sunday 21 January 2007 23:20, Brett Schwarz wrote:
 That's funny actually... I just changed a bunch of these cases in a Tcl
 extension I help maintain, just earlier today. I happened upon this post
 that talks about it:
 http://sourceforge.net/mailarchive/forum.php?thread_id=30611212&forum_id=43966

 Might be worthwhile doing an audit of the rest of the aolserver code for
 these occurrences.

I only found a few in the AOLserver code; I changed about half of them before
I found the one that stopped the bug.

I even changed one in the Tcl codebase that uses this while checking if a
namespace exists.

I have a feeling that the bug shows up for some other reason. ns_atclose 
stores scripts and uses a hash array. I'm guessing that two identical scripts 
might appear as one at some point. This could change the reference count for 
the object, somehow leading to the problem.

tom jackson


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to [EMAIL PROTECTED]
with the body of SIGNOFF AOLSERVER in the email message. You can leave the
Subject: field of your email blank.


[AOLSERVER] AOLserver crash related to ns_atclose and namespace commands

2007-01-21 Thread Tom Jackson
I have been getting some crashes in AOLserver (current cvs version).
AOLserver doesn't exit, but prints the following and stops responding:

'Tcl_SetBooleanObj called with shared object'

Here is a tcl page which exposes the behavior:

---
# Script to expose bug with ns_atclose/namespace commands
set store ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnop
namespace eval ::bug { }

# Commenting out this line leads to bug:
# 'Tcl_SetBooleanObj called with shared object'
#namespace eval ::bug::$store { }

proc ::bug::atClose { store } {
    ns_log Debug "checking if namespace ::bug::$store exists"
    if {[namespace exists ::bug::${store}]} {
        ns_log Debug "Deleting namespace ::bug::$store"
        namespace delete ::bug::${store}
        #log Notice Closed store (memory delete) $store
        return $store
    } else {
        ns_log Debug "namespace ::bug::$store does not exist"
    }
}

# Comment out one of these and things work fine:
ns_atclose ::bug::atClose $store
#ns_atclose ::bug::atClose $store


ns_return 200 text/plain "ns_atclose bug"

-

The bug doesn't show up under all conditions. If the namespace exists, or had 
existed and was deleted, things work as expected. Also, even if the namespace 
never existed, if ns_atclose is only called once, things work as expected.

However, if the namespace to be deleted never existed, and ns_atclose is 
called twice with the same args, none of the ns_log Debug statements print, 
and the crash occurs. (But the page is returned)

Not sure what is the cause.

tom jackson

On Friday 03 November 2006 10:31, Alex wrote:
 Oh, well

 so I guess it was too early to celebrate. Now I am getting the same
 crashes again, even without the exit command in the Tcl code executed in
 the thread.

 Seems to me that the same problem is now discussed in
 bug 1589968
 https://sourceforge.net/tracker/?func=detail&atid=103152&aid=1589968&group_id=3152

 and

 bug 1582671
 http://sourceforge.net/tracker/?func=detail&atid=110894&aid=1582671&group_id=10894


 Thanks,
 ~ Alex.

 On 11/1/06, Alex [EMAIL PROTECTED] wrote:
  Zoran, Jim
 
  thanks very much for the suggestions!
  I think I figured it out.
  The code which was executing in the thread concluded with the Tcl exit
  command. I replaced it with return and it no longer seems to crash.
 
  However, it would probably be a good idea to disable/rename exit for
  the code executed in threads created by ns_thread. Not sure if this
  should be submitted as an enhancement-level bug.
 
  Thanks,
  ~ Alex.
 
  On 11/1/06, Alex [EMAIL PROTECTED] wrote:
   Jim,
  
   I tried it on the command line, and it seems to be my case :)
  
   However, I run aolserver on debian, via /etc/init.d/aolserver,
   which basically invokes /usr/lib/aolserver4/bin/nsd.
   How do I make it use nstclsh instead of tclsh?
   I don't see any options for that.
  
   Thanks,
   ~ Alex.
  
   On 11/1/06, Jim Davidson [EMAIL PROTECTED] wrote:
Hi,
   
I think this is related to the comment I added to the RELEASE notes:
   
* Loading libnsd into a tclsh and then creating new threads with
the ns_thread command will result in a crash when those threads
exit. The issue has to do with finalization of the async-cancel
context used to support the new ns_ictl cancel feature.  This bug
is not present when using the nstclsh binary.
   
   
The issue above, where Tcl is initialized before AOLserver by loading
libnsd into tclsh, results in Tcl thread local storage being
finalized before AOLserver's context which includes a pointer to an
async handler.
   
Now, that's not what you're doing here but perhaps TclX is having the
same effect.  I haven't looked at TclX for some time so I can't recall
what it would be using an async handler for -- perhaps you could dig
through the code and comment it out, as the async handler stuff was
really designed for Unix signal-related things which aren't common in
multi-threaded AOLserver.
   
Alternatively, Tcl could be fixed to avoid freeing itself before
AOLserver or any other extension.  Unfortunately, that could be a big
job -- the Tcl core is already riddled with a lot of code to try to
manage the order of finalization.
   
-Jim
   
On Nov 1, 2006, at 5:35 PM, Zoran Vasiljevic wrote:
 On 01.11.2006, at 23:27, Alex wrote:
 Hi,

 I am getting yet another crash in AOLserver 4.5.0.
 This time it crashes after exiting from threads started with
 ns_thread begin or ns_thread begindetached.

 Any Suggestions?

 Thanks,
 ~ Alex.

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 1086359904 (LWP 19612)]
 0x2ae6c2a7 in Tcl_AsyncDelete (async=0x54e6c0) at
 /srv/DIST/tcl8.4.13/unix/../generic/tclAsync.c:297
 297 while (prevPtr->nextPtr != asyncPtr) {
 (gdb) back
 #0  0x2ae6c2a7 in Tcl_AsyncDelete (async=0x54e6c0) at
 

Re: [AOLSERVER] AOLserver crash related to ns_atclose and namespace commands

2007-01-21 Thread Tom Jackson
Okay, some more info on this.

ns_atclose has been changed in some strange ways.

First it now requires that you are in an open connection to invoke ns_atclose.

ns_atclose used to execute in scheduled procs, which makes sense so that you 
can use one method to clean up stuff in case of errors. 

It is easy to re-enable adding ns_atclose to scheduled procs by removing a few 
lines of code. Now I can call ns_atclose everywhere, but in scheduled procs, 
the cleanup scripts don't run.

Question is: why the (silent) change, and
is there something to replace this?

The old description of the command is here:
http://rmadilo.com/files/nsapi/ns_atclose.html

I still haven't figured out where exactly the crash is coming from, but it is
not in the NsAtCloseObjCmd or NsRunAtClose... code.

tom jackson


Re: [AOLSERVER] AOLserver crash related to ns_atclose and namespace commands

2007-01-21 Thread Tom Jackson
I found the following change fixes the bug:

in nsd/tclresp.c, line 840:

static int
Result(Tcl_Interp *interp, int result)
{
    /* Tcl_SetBooleanObj(Tcl_GetObjResult(interp), result == NS_OK ? 1 : 0); */
    Tcl_SetObjResult(interp, Tcl_NewBooleanObj((result == NS_OK ? 1 : 0)));
    return TCL_OK;
}

I'll commit the change.

tom jackson
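
The difference between the two calls in Result() above can be sketched with a
toy model. This is only an illustration under stated assumptions: ToyObj,
ToyInterp and the toy_* functions are hypothetical stand-ins, not the real Tcl
or AOLserver types. The only behaviour carried over from Tcl is that
Tcl_SetBooleanObj panics when the object it is given is shared (reference
count greater than one), while Tcl_SetObjResult installs a new object as the
result rather than modifying the old one in place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted value; NOT Tcl's Tcl_Obj. */
typedef struct ToyObj {
    int refCount;   /* number of holders referencing this object */
    int boolValue;  /* payload: 0 or 1 */
} ToyObj;

/* Hypothetical stand-in for an interpreter with a single result slot. */
typedef struct ToyInterp {
    ToyObj *result;
} ToyInterp;

/* An object is "shared" when more than one holder references it. */
static int toy_is_shared(const ToyObj *obj)
{
    return obj->refCount > 1;
}

/*
 * In-place update, modelled on the old code path. Returns -1 instead of
 * panicking when the object is shared, so the caller can observe the failure.
 */
static int toy_set_boolean_in_place(ToyObj *obj, int value)
{
    if (toy_is_shared(obj)) {
        return -1;
    }
    obj->boolValue = value ? 1 : 0;
    return 0;
}

/*
 * Replace the result with a fresh object, modelled on the new code path.
 * The old result is released, never modified.
 */
static void toy_set_result(ToyInterp *interp, ToyObj *fresh)
{
    assert(fresh != NULL);
    fresh->refCount++;
    if (interp->result != NULL) {
        interp->result->refCount--;
    }
    interp->result = fresh;
}

/* Walks through both paths; returns 0 if every step behaves as described. */
int toy_demo(void)
{
    ToyObj old = { 2, 0 };    /* held by the result slot and one other holder */
    ToyObj fresh = { 0, 1 };  /* newly created, not yet referenced */
    ToyInterp interp = { &old };

    /* Old path: the shared object must not be modified in place. */
    if (toy_set_boolean_in_place(interp.result, 1) != -1) return 1;
    if (old.boolValue != 0) return 2;

    /* New path: install a fresh object; the other holder still sees 0. */
    toy_set_result(&interp, &fresh);
    if (interp.result != &fresh || interp.result->boolValue != 1) return 3;
    if (old.refCount != 1 || old.boolValue != 0) return 4;
    return 0;
}
```

toy_demo() returns 0 when both steps behave as described. The sketch takes the
shared state of the result object as given; the thread does not establish why
it was shared in the first place.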



Re: [AOLSERVER] AOLserver crash related to ns_atclose and namespace commands

2007-01-21 Thread Brett Schwarz
That's funny actually... I just changed a bunch of these cases in a Tcl
extension I help maintain, just earlier today. I happened upon this post that
talks about it:
 http://sourceforge.net/mailarchive/forum.php?thread_id=30611212&forum_id=43966

Might be worthwhile doing an audit of the rest of the aolserver code for these
occurrences.

--brett
