Re: [Zope-dev] Zope 2.4.4b1 dumps core

2002-02-15 Thread Dieter Maurer

Jeremy Hylton writes:
  ... memory corruption ...
  Does the community have any Zen about how to narrow down bugs like
  this?
I once used purify to analyse this type of problem.

It was not easy: purify slowed Zope down by one to two orders of
magnitude. It was only feasible because the mean time between
crashes was only a few minutes (which turned into about half an
hour when run under purify).


Dieter

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )



Re: [Zope-dev] Zope 2.4.4b1 dumps core

2002-02-14 Thread Jeremy Hylton

 AJ == Andreas Jung [EMAIL PROTECTED] writes:

  AJ Does this problem persist when you remove the 3rd-party products?
  AJ Are you running Zope with the garbage collector enabled?

Just a reminder that you *should* be running with the garbage
collector enabled.  We are aware of no current bugs in the garbage
collector, which has been in use since Python 2.0.

If you see a crash and the gdb stack trace points to the garbage
collector, it is almost surely a sign of memory corruption caused by
bugs elsewhere in Python or in a C extension.  You can prevent those
crashes by disabling the garbage collector, but you are then more likely
to get bizarre errors or storage corruption, since the interpreter's
memory is already corrupted.
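
If you are not sure whether the collector is actually enabled in a
running process, the standard gc module can report it.  A minimal
sketch, written for a current Python; the helper name gc_status is
hypothetical and is not something Zope provides:

```python
import gc


def gc_status():
    """Report whether the cyclic collector is on, plus its thresholds.

    gc_status is a hypothetical helper name used for illustration only.
    """
    return {
        "enabled": gc.isenabled(),          # False after gc.disable()
        "thresholds": gc.get_threshold(),   # tuple of generation thresholds
    }
```

Calling gc_status() from a debugging prompt or an external method
should be enough to tell whether something has switched the collector
off behind your back.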

Jeremy






Re: [Zope-dev] Zope 2.4.4b1 dumps core

2002-02-14 Thread Jeremy Hylton

Python 2.1.2 has some extra safety checks that will immediately detect
the stack overflow bugs that caused problems with PythonScripts before
2.4.4.  The fact that you're seeing crashes in the garbage collector
and not assertion failures in Python/ceval.c makes me think the
problem isn't with the PythonScripts.  I wouldn't rule it out
completely, but it seems unlikely.

Does the community have any Zen about how to narrow down bugs like
this?  It seems a daunting task, in general, because it's not obvious
what particular request exercises the bug, and the error report
doesn't come until long after the bug occurs.

I wonder if you could crank up the garbage collection frequency --
either use gc.set_threshold() to set the threshold very low or add an
explicit call to gc.collect() after each request.  If you're lucky,
this would cause the bug to be detected right after the request that
exercised it.
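
Roughly, the two options look like this.  This is only a sketch for a
current Python: collect_after and make_gc_aggressive are hypothetical
names, the wrapped handler stands in for whatever function handles one
request, and nothing here says where Zope's own publishing code would
need to be hooked.

```python
import gc


def make_gc_aggressive():
    """Lower the thresholds so automatic collections run very often.

    Returns the previous thresholds so they can be restored later.
    """
    old = gc.get_threshold()
    gc.set_threshold(1, 1, 1)
    return old


def collect_after(handler):
    """Wrap a request handler so a full collection runs after each call."""
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        finally:
            # Walk all tracked objects now, so corrupted memory is more
            # likely to be hit right after the request that damaged it.
            gc.collect()
    return wrapper
```

Either option will slow the server down noticeably, so this is
something to try on a test instance rather than in production.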

Where exactly would you put the gc.collect() call to make this work?

Jeremy







Re: [Zope-dev] Zope 2.4.4b1 dumps core

2002-02-13 Thread Leonardo Rochael Almeida

Hi,

As I mentioned in this list before, I'm also getting the segfaults in a
Zope that depends heavily on PythonScripts.

The only C-compiled Product in use is ZMySQLDA, with the latest
versions of everything I could find, and it doesn't seem to be the cause
of the problem, since disabling gc fixes the crashes (just as it did
with 2.4.3). Disabling gc does, however, cause a leak that requires us
to restart the ZEO client every day at 6 AM. (The ZEO client is what is
segfaulting; the ZEO server runs without a glitch.)
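
For anyone wondering why turning gc off leaks: objects caught in a
reference cycle are never freed by reference counting alone. A small
sketch of that effect, assuming CPython and a current Python version;
Node and cycle_survives_without_gc are hypothetical names, not code
from our application:

```python
import gc
import weakref


class Node:
    """Tiny object that refers to itself, forming a reference cycle."""

    def __init__(self):
        self.me = self  # the cycle keeps the refcount above zero


def cycle_survives_without_gc():
    """Return (leaked, freed) for one self-referencing object."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        node = Node()
        probe = weakref.ref(node)
        del node
        # With the collector off, only the cycle keeps the object alive.
        leaked = probe() is not None
        # An explicit collection still works while gc is disabled.
        gc.collect()
        freed = probe() is None
        return leaked, freed
    finally:
        if was_enabled:
            gc.enable()
```

With gc disabled, every such cycle created during a request stays in
memory until the process is restarted.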

I have a feeling that it's crashing less frequently than Zope 2.4.3, but
I cannot confirm this. It's crashing approximately every 20 minutes when
gc is enabled.

I have run out of options as well. We must keep the server under
surveillance at all times because sometimes the morning restart isn't
enough to contain the leak before the machine starts swapping.

Cheers, Leo

On Wed, 2002-02-13 at 16:36, Dario Lopez-Kästen wrote:
 Hi all!
 
 I am sorry to report that Zope 2.4.4b1 dumps core. I have filed a bug issue
 to the collector, and I am also seeking advice on possible ways out of my
 misery :)
 
 I also have attached to the collector issue two tracebacks from two
 different core dumps.
 
 Here is the setup:
 
 Zope 2.4.4b1, source release
 Python 2.1.2 source w/o pymalloc
 RedhatLinux 7.2, kernel 2.4.7-10 on a P4 machine
 
 We use the following products
 
 Formulator 1.0.1 - slightly modified by one consultant
 LocalFS, latest
 Transparent folders, latest
 ReplaceSupport, latest
 Stripogram, latest
 DCO2, latest from CVS
 
 We have a lot of Oracle operations going on in each request, and we use a
 *lot* of PythonScripts.
 
 Any ideas on what might be wrong?
 
 /dario - very, very weary of these core dumps.
 
 - 
 Dario Lopez-Kästen Systems Developer  Chalmers Univ. of Technology
 [EMAIL PROTECTED]  ICQ will yield no hits  IT Systems  Services
 
 
 
 
 
-- 
Ideas don't stay in some minds very long because they don't like
solitary confinement.





Re: [Zope-dev] Zope 2.4.4b1 dumps core

2002-02-13 Thread Chris McDonough

This won't help the crashing problem, but for the leak problem you may want
to consider using AutoLance:
http://www.zope.org/Members/mcdonc/Products/AutoLance




Re: [Zope-dev] Zope 2.4.4b1 dumps core

2002-02-13 Thread Dario Lopez-Kästen


- Original Message -
From: Andreas Jung [EMAIL PROTECTED]
To: Dario Lopez-Kästen [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 7:52 PM
Subject: Re: [Zope-dev] Zope 2.4.4b1 dumps core


 Does this problem persist when you remove the 3rd-party products?
 Are you running Zope with the garbage collector enabled?

 - aj


Hm... I think I am - I have not done anything to turn it off.

Unfortunately I can't remove all 3rd-party products - if I do, I don't have
much of a test case, because then I don't have an app to test against. For
example, if I remove DCO2 I only get errors, because lots of data is used in
presenting layouts, pages etc.

If I remove Formulator, I get errors, because the pages that I would need to
test give errors.

I can remove TransparentFolders and possibly Stripogram, but we have
previously done tests to see if TransparentFolders slowed things down, and
got no conclusive results (and no decrease in core dumps either).

I can also remove Stripogram, but it is a late addition. We have had core
dumps since the days of borked Python 2.1 and Zope 2.4.3/2.4.2 - culprits
have been, in the past, old versions of DCO2, previous versions of Python
and old versions of exuserfolder (which we don't use anymore).

The only thing I have to go on is that as soon as there are lots of
PythonScripts involved things start to deteriorate.

I might also add that changing Zope to use only one thread, as has been
mentioned as a possible workaround with pesky DAs, is not a realistic
option - on the contrary, I need to bump up the thread count to around 10 to
be able to use more than 4 concurrent SQL queries.

I am at a loss here - I don't even know where to begin looking for errors.
Maybe it's our app that is faulty (I know for a fact that around 40% of it
could be done in a cleaner way).

Oh, and the icing on the cake is that almost all of my PythonScripts need
to be recompiled, all of a sudden. All I did was pack the ZODB and copy
it to the production environment. The app is only about 6 megs in size all
in all, so I wouldn't expect any serious ZODB corruption (as the bulk of the
data.fs is mostly app logic, I would expect corruption to show itself real
fast, if it existed).

So, apart from killing myself, is there a way out of this? Or at least a
general direction in which to start looking for possible solutions?

I had to deploy this app 2 days ago, so I'll try to set up as many
safeguards as I can. After that, in a sandbox, I'll set my app up with as
few 3rd-party extensions as possible, and see if it helps.

Any insight is greatly appreciated. I have saved coredumps if anyone would
care to dig thru 20-40 megs of data :-).

Sincerely,

/dario



