Re: [Zope-dev] Zope 2.4.4b1 dumps core
Jeremy Hylton writes:
> ... memory corruption ...
> Does the community have any Zen about how to narrow down bugs like this?

I once used purify to analyse this type of problem. It was not easy: purify slowed Zope down by one to two orders of magnitude. It was only feasible because the mean time between two crashes was just a few minutes (which turned into about half an hour when run under purify).

Dieter

___
Zope-Dev maillist - [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Zope 2.4.4b1 dumps core
AJ == Andreas Jung [EMAIL PROTECTED] writes:

AJ> Does this problem persist when you remove the 3rd-party products?
AJ> Are you running Zope with the garbage collector enabled?

Just a reminder that you *should* be running with the garbage collector enabled. We are aware of no current bugs in the garbage collector, which has been in use since Python 2.0. If you see a crash and the gdb stack trace points to the garbage collector, it is almost surely a sign of memory corruption caused by bugs elsewhere in Python or in a C extension. You can prevent those crashes by disabling the garbage collector, but you are then more likely to get bizarre errors or storage corruption, since the interpreter's memory is still corrupted.

Jeremy
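For anyone unsure whether the collector is actually on in a running process, here is a minimal sketch using the standard gc module. It is written in current Python syntax rather than Python 2.1 syntax, and the helper name ensure_gc_enabled is hypothetical; where you would call it inside Zope is left open.

```python
import gc


def ensure_gc_enabled():
    """Turn the cyclic collector on if it is off.

    Returns the state the collector was in before the call.
    (Hypothetical helper, not part of Zope or Python.)
    """
    was_enabled = gc.isenabled()
    if not was_enabled:
        gc.enable()
    return was_enabled


if __name__ == "__main__":
    previously = ensure_gc_enabled()
    print("collector was enabled:", previously)
    print("collector is enabled: ", gc.isenabled())
    # The three generation thresholds currently in effect.
    print("thresholds:", gc.get_threshold())
```

The collector is enabled by default, so a process that never calls gc.disable() should report that it was already on.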
Re: [Zope-dev] Zope 2.4.4b1 dumps core
Python 2.1.2 has some extra safety checks that will immediately detect the stack overflow bugs that caused problems with PythonScripts before 2.4.4. The fact that you're seeing crashes in the garbage collector and not assertion failures in Python/ceval.c makes me think the problem isn't with the PythonScripts. I wouldn't rule it out completely, but it seems unlikely.

Does the community have any Zen about how to narrow down bugs like this? It seems a daunting task, in general, because it's not obvious what particular request exercises the bug, and the error report doesn't come until long after the bug occurs.

I wonder if you could crank up the garbage collection frequency -- either use gc.set_threshold() to set the threshold very low or add an explicit call to gc.collect() after each request. If you're lucky, this would cause the bug to be detected right after the request that exercised it.

Where exactly would you put the gc.collect() call to make this work?

Jeremy
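The two options mentioned above can be sketched roughly as follows. This is only an illustration in current Python syntax: handle_request, run_with_eager_gc and lower_gc_threshold are hypothetical names, and the sketch does not answer where in Zope's request handling the call belongs. Only gc.collect(), gc.get_threshold() and gc.set_threshold() are real library calls.

```python
import gc


def lower_gc_threshold(threshold0=1):
    """Make automatic generation-0 collections very frequent.

    Returns the previous thresholds so they can be restored.
    Note: a threshold0 of 0 would disable collection entirely,
    so 1 is the lowest useful value.
    """
    old = gc.get_threshold()
    gc.set_threshold(threshold0, old[1], old[2])
    return old


def run_with_eager_gc(handle_request, request):
    """Call a (hypothetical) request handler, then force a full collection.

    The collector walks every tracked container object, so if the
    request corrupted memory, a crash is more likely to happen here,
    close to the request that caused it.
    """
    try:
        return handle_request(request)
    finally:
        gc.collect()


if __name__ == "__main__":
    saved = lower_gc_threshold(1)
    try:
        # Stand-in handler; a real one would publish a Zope request.
        print(run_with_eager_gc(lambda req: req.upper(), "ping"))
    finally:
        gc.set_threshold(*saved)
```

Both options slow the process down considerably, so they are better suited to a test instance that replays traffic than to a production server.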
Re: [Zope-dev] Zope 2.4.4b1 dumps core
Hi,

As I mentioned on this list before, I'm also getting segfaults in a Zope that depends heavily on PythonScripts. The only C-compiled Product in use is ZMySQLDA, with the latest versions of everything I could find, and it doesn't seem to be the cause of the problem, since disabling gc fixes the crashes (just as it did with 2.4.3) but causes a leak that requires us to restart the ZEO client every day at 6 AM. (The ZEO client is what is segfaulting; the ZEO server runs without a glitch.)

I have a feeling that it's crashing less frequently than Zope 2.4.3, but I cannot confirm this. With gc enabled, it crashes approximately every 20 minutes.

I have run out of options as well. We must keep the server under surveillance at all times, because sometimes the morning restart isn't enough to contain the leak before the machine starts swapping.

Cheers, Leo

On Wed, 2002-02-13 at 16:36, Dario Lopez-Kästen wrote:
> Hi all!
>
> I am sorry to report that Zope 2.4.4b1 dumps core. I have filed a bug
> issue to the collector, and I am also seeking advice on possible ways
> out of my misery :)
>
> I have also attached to the collector issue two tracebacks from two
> different core dumps.
>
> Here is the setup:
>
>   Zope 2.4.4b1, source release
>   Python 2.1.2, source, w/o pymalloc
>   Red Hat Linux 7.2, kernel 2.4.7-10, on a P4 machine
>
> We use the following products:
>
>   Formulator 1.0.1 - slightly modified by one consultant
>   LocalFS, latest
>   Transparent folders, latest
>   ReplaceSupport, latest
>   Stripogram, latest
>   DCO2, latest from CVS
>
> We have a lot of Oracle operations going on in each request, and we
> use a *lot* of PythonScripts.
>
> Any ideas on what might be wrong?
>
> /dario - very, very weary of these core dumps.
>
> - Dario Lopez-Kästen, Systems Developer
>   Chalmers Univ. of Technology, IT Systems Services
>   [EMAIL PROTECTED]  ICQ will yield no hits

--
Ideas don't stay in some minds very long because they don't like solitary confinement.
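One plausible mechanism for the leak described above is that, with the collector disabled, objects caught in reference cycles are never freed, because reference counting alone cannot reclaim them. The following self-contained sketch demonstrates that behaviour in CPython. It is an assumption that this is what happens in the reported Zope setup; the Node class and make_cycle function are purely illustrative.

```python
import gc
import weakref


class Node:
    """Minimal object that can take part in a reference cycle."""

    def __init__(self):
        self.partner = None


def make_cycle():
    """Build two objects that refer to each other; return a weak probe."""
    a, b = Node(), Node()
    a.partner, b.partner = b, a
    # The weak reference lets us observe whether 'a' is still alive
    # without keeping it alive ourselves.
    return weakref.ref(a)


gc_was_enabled = gc.isenabled()
gc.disable()
try:
    probe = make_cycle()
    # The local names are gone, but each object still holds the other,
    # so the reference counts never drop to zero.
    leaked = probe() is not None
    # An explicit collection still works while automatic gc is disabled.
    gc.collect()
    reclaimed = probe() is None
finally:
    if gc_was_enabled:
        gc.enable()

print("cycle survived with gc disabled:", leaked)
print("cycle freed by gc.collect():    ", reclaimed)
```

If cycles are indeed the source of the growth, an occasional explicit gc.collect() would release them, though per the advice earlier in the thread it could also trigger the same crash if memory is already corrupted.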
Re: [Zope-dev] Zope 2.4.4b1 dumps core
This won't help the crashing problem, but for the leak problem you may want to consider using AutoLance:

http://www.zope.org/Members/mcdonc/Products/AutoLance

- Original Message -
From: Leonardo Rochael Almeida [EMAIL PROTECTED]
To: Zope Developers list [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 2:38 PM
Subject: Re: [Zope-dev] Zope 2.4.4b1 dumps core

> [...] disabling gc fixes the problem (just like it fixed with 2.4.3)
> but causes a leak that requires us to restart the ZEO Client [...]
> every 6AM.
> [...]
> We must keep the server under surveillance at all times because
> sometimes the morning restart isn't enough to contain the leak before
> the machine starts swapping.
Re: [Zope-dev] Zope 2.4.4b1 dumps core
- Original Message -
From: Andreas Jung [EMAIL PROTECTED]
To: Dario Lopez-Kästen [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2002 7:52 PM
Subject: Re: [Zope-dev] Zope 2.4.4b1 dumps core

> Does this problem persist when you remove the 3rd-party products?
> Are you running Zope with enabled garbage collector?
>
> - aj

Hm... I think I am - I have not done anything to turn it off.

Unfortunately I can't remove all 3rd-party products - if I do, I don't have much of a test case, because then I don't have an app to test against. For example, if I remove DCO2 I only get errors, because lots of data is used in presenting layouts, pages, etc. If I remove Formulator, I get errors, because the pages that I would need to test give errors.

I can remove TransparentFolders and possibly Stripogram, but we have previously done tests to see if TP slowed things down, and got no conclusive results (and no decrease in core dumps either). I can also remove Stripogram, but it is a late addition.

We have had core dumps since the days of borked Python 2.1 and Zope 2.4.3/2.4.2 - culprits have been, in the past, old versions of DCO2, previous versions of Python and old versions of exuserfolder (which we don't use anymore).

The only thing I have to go on is that as soon as there are lots of PythonScripts involved, things start to deteriorate.

I might also add that changing Zope to use only one thread, as has been mentioned as a possible workaround with pesky DAs, is not a realistic option - on the contrary, I need to bump up the thread count to around 10 to be able to use more than 4 concurrent SQL queries.

I am at a loss here - I don't even know where to begin looking for errors. Maybe it's our app that is faulty (I know for a fact that around 40% of it could be done in a cleaner way).

Oh, and the icing on the cake is that almost all of my PythonScripts need to be recompiled, all of a sudden. All I did was pack the ZODB and copy it to the production environment. The app is only about 6 megs in size all in all, so I wouldn't expect any serious ZODB corruption (as the bulk of the data.fs is mostly app logic, I would expect corruption to show itself real fast, if it existed).

So, apart from killing myself, is there a way out of this? Or at least a general direction in which to start looking for possible solutions?

I had to deploy this app 2 days ago, so I'll try to set up as many safeguards as I can. After that, in a sandbox, I'll set my app up with as few 3rd-party extensions as possible, and see if it helps.

Any insight is greatly appreciated. I have saved core dumps if anyone would care to dig through 20-40 megs of data :-).

Sincerely,

/dario