Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread 'Jesus' Jeff Rogers
I had problems starting at the exact same time but on Solaris, where 
they manifested as an EINVAL return from pthread_cond_timedwait.  After a 
day of tracing the problem with debug builds and working with my 
sysadmin to track what changed (of course, nothing had), I came to the 
same 1 billion second issue.


Which coincidentally is the expiry time (MaxOpen and MaxIdle) set on my 
database connections.  

My system is ACS-derived, so I wouldn't be surprised if these database 
settings are common in other ACS-derived systems.


The only bug is that Ns_CondTimedWait doesn't do any wraparound on the 
time parameter.  All the same, I've been enjoying telling people that I 
hit my first y2038 bug.
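
For readers following along, here is a minimal Python sketch of the wraparound being described. It assumes the absolute wake-up time is held in a 32-bit signed count of seconds and that the addition wraps; the helper names (to_int32, absolute_deadline) are made up for illustration and are not AOLserver functions.

```python
INT32_MAX = 2**31 - 1

def to_int32(value: int) -> int:
    """Wrap an integer the way a 32-bit signed seconds counter would."""
    return (value + 2**31) % 2**32 - 2**31

def absolute_deadline(now: int, timeout: int) -> int:
    """Compute now + timeout as a 32-bit signed number of seconds."""
    return to_int32(now + timeout)

ONE_BILLION = 10**9
before = 1147483647  # one second before the threshold
after = 1147483648   # the threshold itself, 2**31 - 10**9

print(absolute_deadline(before, ONE_BILLION))  # 2147483647
print(absolute_deadline(after, ONE_BILLION))   # -2147483648
```

One second before the threshold the deadline still fits; at the threshold it becomes a large negative number, which is the kind of value a timed wait may reject or mishandle.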


-J

Bas Scheffers wrote:

On 17 May 2006, at 21:34, Dossy Shiobara wrote:

Dave Siktberg seems to have narrowed it down to 2006-05-12 21:25.
In what timezone? It sounds like that could equate to Sat May 13 
02:27:28 BST 2006, or 1147483648 seconds since epoch, which makes it 
*exactly* 1,000,000,000 seconds until expiry of 32 bit time. 
Coincidence? Seems too strange, as to a computer that is not a nice 
round number.


I wonder what Dan Brown would have to say about it! :)

Bas.
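
The arithmetic can be checked with a short Python sketch. The BST offset comes from the message above; reading the reported 21:25 as US Eastern Daylight Time (UTC-4) is only an assumption made here for comparison.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Seconds since the epoch at which "now + one billion" first reaches 2**31.
threshold = 2**31 - 10**9

utc = EPOCH + timedelta(seconds=threshold)
bst = utc.astimezone(timezone(timedelta(hours=1)))   # British Summer Time
edt = utc.astimezone(timezone(timedelta(hours=-4)))  # assumed US Eastern Daylight Time

print(threshold)        # 1147483648
print(utc.isoformat())  # 2006-05-13T01:27:28+00:00
print(bst.isoformat())  # 2006-05-13T02:27:28+01:00
print(edt.isoformat())  # 2006-05-12T21:27:28-04:00
```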


--
AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to 
[EMAIL PROTECTED] with the
body of SIGNOFF AOLSERVER in the email message. You can leave the 
Subject: field of your email blank.





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Stan Kaufman

'Jesus' Jeff Rogers wrote:
I had problems starting at the exact same time but on Solaris, where 
they manifested as an EINVAL return from pthread_cond_timedwait.  After 
a day of tracing the problem with debug builds and working with my 
sysadmin to track what changed (of course, nothing had), I came to the 
same 1 billion second issue.


Which coincidentally is the expiry time (MaxOpen and MaxIdle) set on 
my database connections. 
My system is ACS-derived, so I wouldn't be surprised if these database 
settings are common in other ACS-derived systems.



What do you think is the reason that not all systems encounter this 1B 
second issue? The passage of time is the one factor inevitably shared by 
every system running aolserver, yet not every system barfs in the same 
fashion. Why?





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Janine Sisk

On May 19, 2006, at 1:04 PM, 'Jesus' Jeff Rogers wrote:

The only bug is that Ns_CondTimedWait doesn't do any wraparound on  
the time parameter.  All the same, I've been enjoying telling  
people that I hit my first y2038 bug.


So are you saying you've fixed it, or just that you've narrowed it  
down to this?  If you've fixed it, do tell! :)


janine




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Dossy Shiobara
On 2006.05.19, 'Jesus' Jeff Rogers [EMAIL PROTECTED] wrote:
 Which coincidentally is the expiry time (MaxOpen and MaxIdle) set on my 
 database connections.  

Ahahaha!  Yes, in the back of my head I always wondered (and never
bothered to compute) when that silly value of 10^9 would bite someone.
Guess it's May 12, 2006.  :-)

Can everyone who's affected go and change MaxIdle and MaxOpen and
anything else that's a time-in-seconds parameter and lop off a few zeros
and see if that makes the problem go away?

This is too funny.  I'm still chuckling ... :-)  Thanks for figuring it
out!

-- Dossy

-- 
Dossy Shiobara  | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread 'Jesus' Jeff Rogers
I fixed it by simply changing my MaxOpen/MaxIdle settings to 0, which 
is interpreted as forever, which is probably what the original one 
BILLION seconds was intended to be.


It probably wouldn't hurt for Ns_AdjTime (in nsthread/time.c) to check 
for negative seconds and have it fail in a nicer manner, but that 
problem won't affect anyone for years :)
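
As an illustration only, the kind of guard suggested here could look like the Python sketch below. It is a hypothetical stand-in, not the code in nsthread/time.c: the function name adj_time and the choice to raise an error are assumptions, and the real Ns_AdjTime may behave differently.

```python
from typing import Tuple

def adj_time(sec: int, usec: int) -> Tuple[int, int]:
    """Normalize (sec, usec) so that 0 <= usec < 1_000_000.

    Rejects a negative seconds value instead of passing it on to a timed wait.
    """
    sec += usec // 1_000_000  # carry (or borrow) whole seconds
    usec %= 1_000_000
    if sec < 0:
        raise ValueError(f"negative absolute time: {sec}s")
    return sec, usec

print(adj_time(1, 1_500_000))  # (2, 500000)

try:
    adj_time(-2147483648, 0)   # what a wrapped 32-bit deadline looks like
except ValueError as err:
    print("rejected:", err)
```

Failing loudly at this point would turn a silent hang into an error that shows up in the log.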


-J

Janine Sisk wrote:

On May 19, 2006, at 1:04 PM, 'Jesus' Jeff Rogers wrote:

The only bug is that Ns_CondTimedWait doesn't do any wraparound on the 
time parameter.  All the same, I've been enjoying telling people that 
I hit my first y2038 bug.


So are you saying you've fixed it, or just that you've narrowed it down 
to this?  If you've fixed it, do tell! :)


janine





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Zachary Shaw

Ahhh good detective work.

Problem solved.

Thanks!!!


Zach Shaw
Web Developer, Library and Technology Services
Brandeis University
[EMAIL PROTECTED]
781-736-4206


On May 19, 2006, at 4:32 PM, 'Jesus' Jeff Rogers wrote:

I fixed it by simply changing my MaxOpen/MaxIdle settings to 0,  
which is interpreted as forever, which is probably what the  
original one BILLION seconds was intended to be.


It probably wouldn't hurt for Ns_AdjTime (in nsthread/time.c) to  
check for negative seconds and have it fail in a nicer manner, but  
that problem won't affect anyone for years :)


-J

Janine Sisk wrote:

On May 19, 2006, at 1:04 PM, 'Jesus' Jeff Rogers wrote:
The only bug is that Ns_CondTimedWait doesn't do any wraparound  
on the time parameter.  All the same, I've been enjoying telling  
people that I hit my first y2038 bug.
So are you saying you've fixed it, or just that you've narrowed it  
down to this?  If you've fixed it, do tell! :)

janine





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread 'Jesus' Jeff Rogers

Stan Kaufman wrote:

Which coincidentally is the expiry time (MaxOpen and MaxIdle) set on 
my database connections. My system is ACS-derived, so I wouldn't be 
surprised if these database settings are common in other ACS-derived 
systems.



What do you think is the reason that not all systems encounter this 1B 
second issue? The passage of time is the one factor inevitably shared by 
every system running aolserver, yet not every system barfs in the same 
fashion. Why?


Simple, because it's a config file setting, not anything to do with the 
underlying system.  If your config file has


[ns/db/pool/main]
MaxOpen=1000000000
MaxIdle=1000000000

(which I think was done to work around some ancient bug in an ancient 
version of the nsoracle driver) then you get the problem.  If your 
timeouts are more reasonable or 0 to explicitly specify never timeout, 
then no problem.
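
A rough way to check a config file for this is sketched below in Python. It assumes the same 32-bit limit and the "0 means never time out" behaviour described in this thread. The function name risky_timeouts and the sample config text are invented for the example, and only MaxOpen and MaxIdle are scanned.

```python
import re

INT32_MAX = 2**31 - 1
SETTING = re.compile(r"^\s*(MaxOpen|MaxIdle)\s*=\s*(\d+)\s*$", re.IGNORECASE)

def risky_timeouts(config_text: str, now: int):
    """Return (name, seconds) pairs whose absolute expiry would not fit in 32 bits."""
    risky = []
    for line in config_text.splitlines():
        match = SETTING.match(line)
        if match:
            name, seconds = match.group(1), int(match.group(2))
            # 0 means "never time out", so no absolute deadline is computed.
            if seconds != 0 and now + seconds > INT32_MAX:
                risky.append((name, seconds))
    return risky

sample = """\
[ns/db/pool/main]
MaxOpen=1000000000
MaxIdle=0
"""

print(risky_timeouts(sample, now=1147483648))  # [('MaxOpen', 1000000000)]
print(risky_timeouts(sample, now=1147483647))  # []
```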

It took me longer than it should have to track down this problem, since 
it was happening immediately after the database connections were 
started, and other servers with no database connections (like the 
keepalive server) had no problems.  We of course thought there was some 
database issue but didn't think about looking at the settings, or that 
the unix daemon dogging your heels for removing that comment would stop 
so soon (obscure humor there...).


I imagine on Linux it manifests differently; on Solaris I got the EINVAL 
return from pthread_cond_timedwait (of course it isn't documented that 
this can mean a bad time, it usually means a bad pointer) but on Linux 
with a different pthreads implementation it could result in locking up 
or just never returning, which presumably would result in all timed 
events after it in the queue (which is sorted by time) getting blocked and 
not running, as others reported.


-J




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Michael A. Cleverly

On 5/19/06, 'Jesus' Jeff Rogers [EMAIL PROTECTED] wrote:


I imagine on Linux it manifests differently; on Solaris I got the EINVAL
return from pthread_cond_timedwait (of course it isn't documented that
this can mean a bad time, it usually means a bad pointer) but on Linux
with a different pthreads implementation it could result in locking up
or just never returning, which presumably would result in all timed
events after it in the queue (which is sorted by time) getting blocked and
not running, as others reported.


No EINVAL on Linux, but first visible symptom was
acs_messaging_process_queue not running every fifteen minutes
anymore...

Changing the MaxIdle/MaxOpen settings and restarting AOLserver worked
like a charm.

Michael




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread dhogaza
 On 2006.05.19, 'Jesus' Jeff Rogers [EMAIL PROTECTED] wrote:

 Ahahaha!  Yes, in the back of my head I always wondered (and never
 bothered to compute) when that silly value of 10^9 would bite someone.
 Guess it's May 12, 2006.  :-)

 Can everyone who's affected go and change MaxIdle and MaxOpen and
 anything else that's a time-in-seconds parameter and lop off a few zeros
 and see if that makes the problem go away?

 This is too funny.  I'm still chuckling ... :-)  Thanks for figuring it
 out!

This is the number of seconds the handle is kept alive after it is first
created.  Not the number of seconds the handle is kept alive from the
beginning of some AOLserver magic zero moment.

Aren't people having this problem after rebooting, too?

I admit I've not been paying as close attention to this thread as I might,
as I've not used 3.3 in years.




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread dhogaza
 On 2006.05.19, 'Jesus' Jeff Rogers [EMAIL PROTECTED] wrote:

 I admit I've not been paying as close attention to this thread as I might,
 as I've not used 3.3 in years.

Next time remind me that this is a good reason to think before posting,
folks!




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Andrew Piskorski
On Fri, May 19, 2006 at 02:30:54PM -0700, dhogaza@PACIFIER.COM wrote:
  (which I think was done to work around some ancient bug in an ancient
  version of the nsoracle driver) then you get the problem.
 
 I think the problem was in the oracle library (OCI), but it's been a long,
 long time.

Yep.  For those interested in ancient trivia, I think it was TWO bugs,
one in the Oracle driver and/or OCI libraries (most likely OCI), and
one in AOLserver.  I think the workaround dates from before I ever
used AOLserver, but I have these old comments in my AOLserver config
file:

# MaxIdle and MaxOpen: 
# 
# Setting these to 1000000000 is a historical bug workaround.  Could 
# now probably set this to some normal number, or set to 0 to disable 
# entirely.  E.g., in this thread Rob Mayoff [EMAIL PROTECTED] says: 
# 
# http://www.arsdigita.com/bboard/q-and-a-fetch-msg?msg%5fid=000Ibq 
# 
#   It is a bug workaround. Many Linux users (including me) saw that 
#   when AOLserver tried to close a database connection, it would hang 
#   in the Oracle driver. So people started setting MaxOpen and MaxIdle to a 
#   very large number to keep connections from closing. You can also set 
#   them to zero, but at the time the bug was discovered, AOLserver had 
#   a bug that prevented you from setting them to zero. 
# 
#   I believe the bug was also seen, very rarely, on Solaris. 
# 
#   Curtis Galloway managed to get Oracle to investigate. They suggested 
#   two workarounds: use IPC or TCP to connect (which is what I do on my 
#   system), or set bequeath_detach=yes in sqlnet.ora. 
# 
# [EMAIL PROTECTED], 2002/01/10 14:22 EST 

-- 
Andrew Piskorski [EMAIL PROTECTED]
http://www.piskorski.com/




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Stan Kaufman

'Jesus' Jeff Rogers wrote:
Simple, because it's a config file setting, not anything to do with 
the underlying system.  If your config file has


[ns/db/pool/main]
MaxOpen=1000000000
MaxIdle=1000000000

(which I think was done to work around some ancient bug in an ancient 
version of the nsoracle driver) then you get the problem.  If your 
timeouts are more reasonable or 0 to explicitly specify never timeout, 
then no problem.


Ah. For whatever reason, in my config files MaxOpen and MaxIdle were 
commented out; that must be why my systems didn't encounter this problem.


Not defining these two vars must have been the way the 3.2.5 config file 
was distributed back in the day, as this is not something I would have 
thought to do. Leaving them undefined seems to have no ill effect.


So are people defining them to 0 or simply undefining them?




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Stan Kaufman

Stan Kaufman wrote:
Not defining these two vars must have been the way the 3.2.5 config 
file was distributed back in the day


Yup, that was the case: http://openacs.org/doc/openacs-3/nsd.txt

Interesting that that inoculated OpenACS 3.2.5 systems from this problem.




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread dhogaza
 On Fri, May 19, 2006 at 02:30:54PM -0700, dhogaza@PACIFIER.COM wrote:
 #   You can also set
 #   them to zero, but at the time the bug was discovered, AOLserver had
 #   a bug that prevented you from setting them to zero.

Yeah, I knew there was a reason a big number rather than zero was chosen,
too, but couldn't remember why.

How funny.




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Dossy Shiobara
On 2006.05.19, Stan Kaufman [EMAIL PROTECTED] wrote:
 Which coincidentally is the expiry time (MaxOpen and MaxIdle) set on 
 my database connections. 
 My system is ACS-derived, so I wouldn't be surprised if these database 
 settings are common in other ACS-derived systems.
 
 What do you think is the reason that not all systems encounter this 1B 
 second issue? The passage of time is the one factor inevitably shared by 
 every system running aolserver, yet not every system barfs in the same 
 fashion. Why?

Generally, the only time I've seen people config the MaxOpen and MaxIdle
to 1B seconds is when they're using an ACS or OpenACS recommended
configuration.

It might be a good idea for the OpenACS folks to edit/update their
documentation and sample configs to correct this, as well ... although
it'll never be May 13, 2006 ever again, so maybe it's a non-issue.  :-)

-- Dossy

-- 
Dossy Shiobara  | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Dossy Shiobara
On 2006.05.19, dhogaza@PACIFIER.COM dhogaza@PACIFIER.COM wrote:
 The billion seconds added to the current time when the database handle's
 created is causing the problem, with Solaris being nice enough to toss an
 error, Linux just screwing up.

To be fair, Linux isn't screwing up -- the time-to-sleep being
passed to pthread_cond_timedwait is a *real long time*.  Solaris must have
some limit as to how long it'll let a thread condwait for, where Linux
doesn't ... so Solaris returns an error while Linux just ... waits.  :-)

Am I the only one who finds this funny?  :-)

-- Dossy

-- 
Dossy Shiobara  | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Dossy Shiobara
On 2006.05.19, dhogaza@PACIFIER.COM dhogaza@PACIFIER.COM wrote:
  On Fri, May 19, 2006 at 02:30:54PM -0700, dhogaza@PACIFIER.COM wrote:
  #   You can also set
  #   them to zero, but at the time the bug was discovered, AOLserver had
  #   a bug that prevented you from setting them to zero.
 
 Yeah, I knew there was a reason a big number rather than zero was chosen,
 too, but couldn't remember why.
 
 How funny.

If only folks chose 10^8 instead of 10^9 ... it would have been 1157
days or 3.1 years worth of MaxOpen/MaxIdle, and we wouldn't have
encountered this weird thread hanging bug until Nov 18, 2034.  :-)
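
The figures above can be reproduced with a few lines of Python. This is only a check of the arithmetic, again assuming the 32-bit signed limit of 2**31 seconds; the date is computed in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

timeout = 10**8                 # the smaller value suggested above
days = timeout / 86400          # roughly 1157.4 days
years = days / 365.25           # roughly 3.17 years

# First moment at which "now + timeout" would reach 2**31.
first_failure = EPOCH + timedelta(seconds=2**31 - timeout)

print(int(days))                  # 1157
print(first_failure.isoformat())  # 2034-11-18T17:27:28+00:00
```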

-- Dossy

-- 
Dossy Shiobara  | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread Stan Kaufman

Dossy Shiobara wrote:

Generally, the only time I've seen people config the MaxOpen and MaxIdle
to 1B seconds is when they're using an ACS or OpenACS recommended
configuration.

It might be a good idea for the OpenACS folks to edit/update their
documentation and sample configs to correct this, as well ... although
it'll never be May 13, 2006 ever again, so maybe it's a non-issue.  :-)
  



Why doesn't AOLServer v4.x have problems with MaxOpen and MaxIdle set to 
1B -- as they are in OpenACS 5.x configs?


Do MaxOpen and MaxIdle even need to be defined at all? They weren't in 
OpenACS 3.2.5/AOLServer 3.3.1+ad13; what's different with the current stack?





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread dhogaza
 On 2006.05.19, Stan Kaufman [EMAIL PROTECTED] wrote:

 It might be a good idea for the OpenACS folks to edit/update their
 documentation and sample configs to correct this, as well ... although
 it'll never be May 13, 2006 ever again, so maybe it's a non-issue.  :-)

Now that zero works, we'll probably switch to that.  I know it's worked
for years, sometimes it takes us a while to catch up :)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread dhogaza
 If only folks chose 10^8 instead of 10^9 ... it would have been 1157
 days or 3.1 years worth of MaxOpen/MaxIdle, and we wouldn't have
 encountered this weird thread hanging bug until Nov 18, 2034.  :-)

 -- Dossy

Shit like this happens when you know your software platform's reliable and
may not crash for decades! :)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-19 Thread dhogaza
 If only folks chose 10^8 instead of 10^9 ... it would have been 1157
 days or 3.1 years worth of MaxOpen/MaxIdle, and we wouldn't have
 encountered this weird thread hanging bug until Nov 18, 2034.  :-)

 -- Dossy

Shit like this happens when you know your software platform's reliable! :)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-18 Thread Zachary Shaw


1) ns_info patchlevel


3.3.1+ad13
8.3.2



2) uname -a


Linux my.brandeis.edu 2.4.9-e.68smp #1 SMP Thu Jan 19 18:38:50 EST 2006 i686 unknown



3) glibc version


[EMAIL PROTECTED] root]# rpm -qa | grep glibc
compat-glibc-6.2-2.1.3.2
glibc-common-2.2.4-32.23
glibc-2.2.4-32.23
glibc-devel-2.2.4-32.23




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-18 Thread Stan Kaufman

Cynthia Kiser wrote:

Quoting Gustaf Neumann [EMAIL PROTECTED]:
  

It seems, as if all problems are in tcl 8.3.*.


But just a data point from the not happening here side. I have

info patchlevel 8.3.2
  
FWIW, on my boxes where aolserver appears to function correctly, the tcl 
in the aolserver lib is 8.3:


/usr/local/aolserver/lib/tcl8.3/

On one box I have tcl 8.3.3 installed generally, and on another 8.4.9 
(as revealed by checking [info patchlevel] from tclsh). But I presume 
that in both cases, aolserver is still using the tcl in its lib 
directory, correct? Since [ns_info patchlevel] isn't implemented in 
3.3.1+ad13, is there some way to tell for sure which tcl aolserver is using?


Anyway, if it is the case that aolserver is using its own version of 
tcl, then the tcl 8.3.x is the problem theory wouldn't appear to stand.





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-18 Thread Andrew Piskorski
On Thu, May 18, 2006 at 10:03:59AM -0700, Stan Kaufman wrote:

 On one box I have tcl 8.3.3 installed generally, and on another 8.4.9 
 (as revealed by checking [info patchlevel] from tclsh). But I presume 
 that in both cases, aolserver is still using the tcl in its lib 
 directory, correct? Since [ns_info patchlevel] isn't implemented in 
 3.3.1+ad13, is there some way to tell for sure which tcl aolserver is using?

If all you want to know is the Tcl version AOLserver is using, it's
trivially easy, just use the Tcl info patchlevel command from inside
AOLserver.

 Anyway, if it is the case that aolserver is using its own version of 
 tcl, then the tcl 8.3.x is the problem theory wouldn't appear to stand.

All versions of AOLserver 3.3 REQUIRED and shipped with their own
special version of Tcl 8.3.x.  So unless you took very special steps
to make it do so, it's very unlikely that your AOLserver is using any
other version of Tcl.  I tried once to make AOLserver 3.3+ad13 use a
newer version of Tcl - I failed, and I never heard of anyone else
doing it either.

-- 
Andrew Piskorski [EMAIL PROTECTED]
http://www.piskorski.com/




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-18 Thread Zachary Shaw


FWIW, on my boxes where aolserver appears to function correctly,  
the tcl in the aolserver lib is 8.3:




I have what seems to be a reliable test to see if aolserver is working properly:

1) Copy off your .ini file.
2) Modify any Library entries and point them to a new location.
3) In the new location, put a .tcl file that schedules a number of
   ns_logs, some with threading (I suggest having the procs run every
   second or 2 or 3).
4) Also kill any lines that have an auxconfigdir.
5) Bring up the new server using your new .ini file.

If the logs stop getting written to, you know you have a problem.


Zach Shaw
Web Developer, Library and Technology Services
Brandeis University
[EMAIL PROTECTED]
781-736-4206




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-18 Thread Stan Kaufman

Andrew Piskorski wrote:

If all you want to know is the Tcl version AOLserver is using, it's
trivially easy, just use the Tcl info patchlevel command from inside
AOLserver.

Right; duh. Thanks. So, the version of tcl my 3.3.1+ad13 sites are using 
is 8.3.2, and they appear to reboot without the VM and scheduled proc 
problems. FWIW.





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-18 Thread dhogaza
 All versions of AOLserver 3.3 REQUIRED and shipped with their own
 special version of Tcl 8.3.x.  So unless you took very special steps
 to make it do so, it's very unlikely that your AOLserver is using any
 other version of Tcl.  I tried once to make AOLserver 3.3+ad13 use a
 newer version of Tcl - I failed, and I never heard of anyone else
 doing it either.

I did some googling and it appears you're right.  People did get 8.4
working with AOLserver 3.4, but not 3.3 ...




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Janine Sisk

1) ns_info patchlevel


3.3 apparently didn't have patchlevel, as that gave me an error.  The  
output of ns_version is 3.3.1+ad13



2) uname -a


Linux x.furfly.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:54:53 EST 2006 i686 i686 i386 GNU/Linux



3) glibc version


$ rpm -qa | grep glibc
glibc-kernheaders-2.4-9.1.98.EL
glibc-common-2.3.4-2.19
glibc-devel-2.3.4-2.19
glibc-2.3.4-2.19
glibc-headers-2.3.4-2.19

On May 17, 2006, at 1:34 PM, Dossy Shiobara wrote:


On 2006.05.17, Zachary Shaw [EMAIL PROTECTED] wrote:

We're experiencing a similar issue at Brandeis University, but we get
no error, our scheduled procs just hang. [...] we're running aolserver
3.3.1 ad13 [...] if I set the system date to May 12th or earlier all
the procs will run.  Otherwise they run for a little then stop.

looking at the straces the difference appears to be in how the
nanosleep is set for the pids.

before may 13th nanosleep was in the form
[pid   614] nanosleep({0, 34478},  unfinished ...

after the 12th there were nanosleeps in the form
[pid   614] nanosleep({9, 934211000},  unfinished ...


Dave Siktberg seems to have narrowed it down to 2006-05-12 21:25.

What's interesting is I'm running AOLserver 4.0.10 on x86/Linux 2.6.15.6
with glibc6 2.3.5 with no OpenACS and all my scheduled procs are firing
just fine.

Can we get everyone who's experiencing this problem to provide a few
things:

1) ns_info patchlevel
2) uname -a
3) glibc version

I'm betting this is an older Linux or LinuxThreads or glibc problem.  I
could be wrong, of course, but gathering this info will help to figure
it out.

-- Dossy

--
Dossy Shiobara  | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Titi Ala'ilima
We just decided to move everything left on 3.3ad13 to 4.0, but to help 
those who need it:



Can we get everyone who's experiencing this problem to provide a few
things:

1) ns_info patchlevel
  

I think you mean info patchlevel

I've got 8.3.2

2) uname -a
  
Linux servername 2.4.21-4.EL #1 Fri Oct 3 18:13:58 EDT 2003 i686 i686 i386 GNU/Linux



3) glibc version
  

glibc-2.3.2-95.3




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Janine Sisk

My info patchlevel is also 8.3.2.

janine

On May 17, 2006, at 2:12 PM, Titi Ala'ilima wrote:

We just decided to move everything left on 3.3ad13 to 4.0, but to  
help those who need it:



Can we get everyone who's experiencing this problem to provide a few
things:

1) ns_info patchlevel


I think you mean info patchlevel

I've got 8.3.2

2) uname -a

Linux servername 2.4.21-4.EL #1 Fri Oct 3 18:13:58 EDT 2003 i686 i686 i386 GNU/Linux



3) glibc version


glibc-2.3.2-95.3




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Dossy Shiobara
On 2006.05.17, Titi Ala'ilima [EMAIL PROTECTED] wrote:
 We just decided to move everything left on 3.3ad13 to 4.0, but to help 
 those who need it:

So, you're seeing this problem even on AOLserver 4.0?  What version of
OpenACS?

 Can we get everyone who's experiencing this problem to provide a few
 things:
 
 1) ns_info patchlevel
   
 I think you mean info patchlevel
 
 I've got 8.3.2

No, I meant ns_info patchlevel -- to get the full version of AOLserver
that's running -- but yes, I should have asked for info patchlevel
too, to find out what version of Tcl is being used.

Janine, could you give us your info patchlevel too?  Same with
everyone else who is seeing this problem and is reporting information.
Thanks.

-- Dossy

-- 
Dossy Shiobara  | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Janine Sisk

On May 17, 2006, at 2:35 PM, Dossy Shiobara wrote:


On 2006.05.17, Titi Ala'ilima [EMAIL PROTECTED] wrote:
Janine, could you give us your info patchlevel too?  Same with
everyone else who is seeing this problem and is reporting information.


8.3.2 also.

janine




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Stan Kaufman

Dossy Shiobara wrote:
No, I meant ns_info patchlevel -- to get the full version of
AOLserver that's running -- but yes, I should have asked for info
patchlevel too, to find out what version of Tcl is being used.


patchlevel appears not to be a switch to ns_info in 
AOLserver/3.3.1+ad13 (as Janine pointed out):


[17/May/2006:15:08:53][14372.11231392][-conn5-] Error: unknown command 
patchlevel:  should be address, argv0, builddate, callbacks, config, 
hostname, label, locks, log, name, pageroot, pid, platform, scheduled, 
server, sockcallbacks, tag, tcllib, threads, version, or winnt

   while executing
ns_info patchlevel

In any case, here is info about a system that appears *not* to have this 
trouble:


% info patchlevel
8.4.9

ns_info version:
3.3.1+ad13

uname:
Linux x 2.4.27 #1 Mon Jul 4 21:39:37 PDT 2005 i686 GNU/Linux
(Debian sarge)

glibc:
libstdc++2.10-glibc2.2




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Guan Yang

On 17 May  2006, at 23:35 , Dossy Shiobara wrote:


On 2006.05.17, Titi Ala'ilima [EMAIL PROTECTED] wrote:
We just decided to move everything left on 3.3ad13 to 4.0, but to help
those who need it:


So, you're seeing this problem even on AOLserver 4.0?  What version of
OpenACS?


I don't think anyone has seen this problem on AOLserver 4.0.

Guan




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Bas Scheffers

On 17 May 2006, at 21:34, Dossy Shiobara wrote:

Dave Siktberg seems to have narrowed it down to 2006-05-12 21:25.

In what timezone? It sounds like that could equate to Sat May 13
02:27:28 BST 2006, or 1147483648 seconds since epoch, which makes it
*exactly* 1,000,000,000 seconds until expiry of 32 bit time.
Coincidence? Seems too strange, as to a computer that is not a nice
round number.


I wonder what Dan Brown would have to say about it! :)

Bas.
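
The arithmetic behind this observation can be checked with a short sketch.
It is only an illustration: the constant and function names are made up
here, and the idea that some code adds a 1,000,000,000-second interval to
the current time in a signed 32-bit time value is an assumption, not
something this message establishes.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1      # largest value a signed 32-bit time value can hold
WATERSHED = 1147483648     # epoch seconds quoted in the message above
BILLION = 1_000_000_000    # hypothetical interval added to "now" (assumption)


def wrap_int32(value: int) -> int:
    """Reduce value to the signed 32-bit range (two's-complement wraparound)."""
    return (value + 2**31) % 2**32 - 2**31


# The watershed is exactly one billion seconds before 2**31.
print(WATERSHED + BILLION == 2**31)

# One second earlier the sum still fits; at the watershed it wraps negative.
print(wrap_int32(WATERSHED - 1 + BILLION))
print(wrap_int32(WATERSHED + BILLION))

# The same instant expressed in UTC.
print(datetime.fromtimestamp(WATERSHED, tz=timezone.utc).isoformat())
```

If the assumption holds, any "now + 1,000,000,000" computed at or after
the watershed would no longer fit in 32 signed bits, which would be
consistent with the sudden change in behaviour reported in this thread.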




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Dave Siktberg
SERVER #1 exhibiting the problem

% info patchlevel
8.3.3

$ rpm -qa | grep glibc
glibc-2.2.4-13
glibc-common-2.2.4-13
glibc-devel-2.2.4-13

$ uname -a
Linux opus 2.4.7-10 #1 Thu Sep 6 17:27:27 EDT 2001 i686 unknown

ns_info version: 3.3.1+ad13 




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread dhogaza
 So, you're seeing this problem even on AOLserver 4.0?  What version of
 OpenACS?

 I don't think anyone has seen this problem on AOLserver 4.0.

Right, I think he meant they're fleeing to AOLserver 4.x to get rid of the
problem, and was just posting his current system info so others might be
able to figure out why 3.x isn't working.




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Gustaf Neumann

It seems as if all the problems are in Tcl 8.3.*.
It should be possible to compile AOLserver 3.* with e.g. Tcl 8.4.13.
Seems worth a try.


-gustaf

Dave Siktberg wrote:

SERVER #1 exhibiting the problem

% info patchlevel
8.3.3

$ rpm -qa | grep glibc
glibc-2.2.4-13
glibc-common-2.2.4-13
glibc-devel-2.2.4-13

$ uname -a
Linux opus 2.4.7-10 #1 Thu Sep 6 17:27:27 EDT 2001 i686 unknown

ns_info version: 3.3.1+ad13 





Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Dave Siktberg
Great detective work!  I assumed the watershed time would have this kind of
characteristic.
My timezone is Eastern Daylight, BTW.

  Dave Siktberg seems to have narrowed it down to 2006-05-12 21:25.
 In what timezone? It sounds like that could equate to Sat May 13
 02:27:28 BST 2006, or 1147483648 seconds since epoch, which makes it
 *exactly* 1,000,000,000 seconds until expiry of 32 bit time.
 Coincidence? Seems too strange, as to a computer that is not a nice
 round number.
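
As a rough cross-check of the time zone question, the same epoch value can
be rendered in Eastern Daylight Time. The fixed UTC-4 and UTC+1 offsets
below are assumptions standing in for EDT and BST in May 2006, and the
variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

WATERSHED = 1147483648  # epoch seconds from the quoted message

# Fixed offsets assumed for May 2006: EDT is UTC-4, BST is UTC+1.
EDT = timezone(timedelta(hours=-4), "EDT")
BST = timezone(timedelta(hours=1), "BST")

edt_time = datetime.fromtimestamp(WATERSHED, tz=EDT)
bst_time = datetime.fromtimestamp(WATERSHED, tz=BST)

print(edt_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
print(bst_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

Under these assumptions the EDT rendering falls on the evening of
2006-05-12, within a few minutes of the reported 21:25.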




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Titi Ala'ilima
Sorry, I meant instead of waiting for a fix to the problem, we're moving sites 
to 4.0.  But I contributed this info to add to the body of data on the 3.x 
problem.

--- [EMAIL PROTECTED] wrote:

From: Dossy Shiobara [EMAIL PROTECTED]
To: AOLSERVER@LISTSERV.AOL.COM
Subject: Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird 
memory leak problem in AOLserver 3.4.2/3.x)
Date: Wed, 17 May 2006 17:35:07 -0400

On 2006.05.17, Titi Ala'ilima [EMAIL PROTECTED] wrote:
 We just decided to move everything left on 3.3ad13 to 4.0, but to help 
 those who need it:

So, you're seeing this problem even on AOLserver 4.0?  What version of
OpenACS?

 Can we get everyone who's experiencing this problem to provide a few
 things:
 
 1) ns_info patchlevel
   
 I think you mean info patchlevel
 
 I've got 8.3.2

No, I meant ns_info patchlevel -- to get the full version of AOLserver
that's running -- but yes, I should have asked for info patchlevel
too, to find out what version of Tcl is being used.

Janine, could you give us your info patchlevel too?  Same with
everyone else who is seeing this problem and is reporting information.
Thanks.

-- Dossy

-- 
Dossy Shiobara  | [EMAIL PROTECTED] | http://dossy.org/
Panoptic Computer Network   | http://panoptic.com/
  He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on. (p. 70)




Re: [AOLSERVER] Something wrong after 2006-05-12 21:25 (was Re: Weird memory leak problem in AOLserver 3.4.2/3.x)

2006-05-17 Thread Cynthia Kiser
Quoting Gustaf Neumann [EMAIL PROTECTED]:
 It seems as if all the problems are in Tcl 8.3.*.
 It should be possible to compile AOLserver 3.* with e.g. Tcl 8.4.13.

But just a data point from the "not happening here" side. I have
AOLserver 3.3.1+ad13 servers that are all running fine - both the ones
that have been long running and those that get HUPed every morning. 

info patchlevel 8.3.2

AOLserver/3.3.1+ad13 

Code is mostly based on ACS 3.4.10

Linux 2.4.20-29.7.progeny.9bigmem #1 SMP Fri Jan 7 18:08:47 EST 2005
i686 unknown (RedHat 7.3)

$ rpm -qa | grep glibc
compat-glibc-6.2-2.1.3.2
glibc-2.2.5-44.progeny.1
glibc-kernheaders-2.4-7.16
glibc-common-2.2.5-44.progeny.1
glibc-devel-2.2.5-44.progeny.1

My guess was that there was something about the length or content of
the scheduled procs list, but the server that was just running a
single file worth of scheduled procs (or rather not running it)
demolished that theory. 

-- 
Cynthia Kiser
[EMAIL PROTECTED]

