Re: [Zope-dev] Re: more on the segfault saga

2002-03-25 Thread Leonardo Rochael Almeida


Hello segfaulters and others interested in Zope instability issues!

Our demi-god Matt Kromer from ZopeCorp has come up with a possible way
to corner the instability issue AND give you a stable, cycle-garbage
collecting Zope.

Since the problem seems, so far, to be caused by the Python Restricted
Compiler (which is used in everything from DTML expressions to Python
scripts to other stuff) not finishing the construction of collectable
objects before the Python cycle garbage collector finds them, the
solution is to lock out the gc while these objects are being created.
The only easy way to do this currently is to disable the automatic gc
and run manual garbage collections only when we're pretty sure no one
else is running, while not letting anyone else run during the collection.
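
For those who want to see the idea in isolation, here is a minimal,
self-contained sketch using only the standard library. It is not the
attached file: that one uses Zope's ThreadLock, while the names
handle_request, make_cycle, active and collections below are made up for
illustration. Automatic collection is switched off, a lock-protected
counter tracks the threads doing work, and a manual gc.collect() runs
only when the last active thread finishes, while the lock keeps anyone
else from starting.

```python
import gc
import threading

gc_lock = threading.Lock()
active = 0        # threads currently doing work (hypothetical counter)
collections = 0   # number of manual collections performed


def handle_request(work):
    """Run work() with the collector locked out; collect only when idle."""
    global active, collections
    with gc_lock:
        active += 1
    try:
        return work()
    finally:
        with gc_lock:
            active -= 1
            if active == 0:
                # No one else is running, and no one can start while
                # we hold the lock, so it is safe to collect now.
                gc.collect()
                collections += 1


def make_cycle():
    """Create cyclic garbage that only the cycle collector can free."""
    a = []
    a.append(a)


gc.disable()  # no automatic collections while the workers run
try:
    threads = [threading.Thread(target=handle_request, args=(make_cycle,))
               for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
finally:
    gc.enable()
```

The trade-off is that a busy server may go a long time without ever
being idle, so garbage can pile up between collections.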

I can't speak for Matt, but since this is a fairly urgent matter, I
believe he'd agree that those experiencing segfaults should replace
Zope/ZServer/PubCore/ZServerPublisher.py with the attached file, which
should work on the 2.4.x and 2.5.x series of Zope, and report their
stability results.

This is the same file that can be found at:
http://zope.org/Members/matt/ZServerPublisher.py
with the difference that my version has some lines removed that are only
interesting for those that applied Matt's cprof patches mentioned
earlier on this list (which, I bet, means only me :-).

The file is small enough so that you can manually look and see that I've
installed no trojans in it :-) but those of a paranoid nature are
encouraged to download Matt's version and remove the two lines that
mention 'cprof'.

We're close guys, very close.

Cheers, Leo

PS: standard disclaimers: I don't speak for anyone else but me and I
won't be held responsible for anything you do to your site based on the
aforementioned instructions. If you break your site with them, you get to
keep both pieces :-)

-- 
Ideas don't stay in some minds very long because they don't like
solitary confinement.



##
#
# Copyright (c) 2001 Zope Corporation and Contributors. All Rights Reserved.
# 
# This software is subject to the provisions of the Zope Public License,
# Version 2.0 (ZPL).  A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED AS IS AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE
# 
##
from ZPublisher import publish_module

import ThreadLock

gc_lock = ThreadLock.allocate_lock()
active_threads = [0,]

class ZServerPublisher:
    def __init__(self, accept):
        import os
        import sys
        import gc

        gc.disable()

        while 1:
            try:
                name, request, response=accept()

                gc_lock.acquire()
                active_threads[0] = active_threads[0] + 1
                gc_lock.release()

                publish_module(
                    name,
                    request=request,
                    response=response)
            finally:
                response._finish()
                request=response=None
                gc_lock.acquire()
                a = active_threads[0] - 1

                if a == 0:
                    #sys.stderr.write("Invoking gc.collect()\n")
                    gc.collect()
                else:
                    #sys.stderr.write("Skipping gc.collect(), %d threads active\n" % a)
                    pass

                active_threads[0] = a
                gc_lock.release()



Re: [Zope-dev] Re: more on the segfault saga

2002-03-21 Thread Leonardo Rochael Almeida

Ok, got some data on using these patches.

First of all, for those following, these patches don't seem to work well
if Zope is started as root, because gdb will be started as the user Zope
switches to, and that gdb won't be able to attach to a process started
by root, even if it has dropped its privileges.

Now, the gdb.cmd script that comes with it is unable to create the
trace_dump file for some reason. Below are the URLs to Zope's stdout/err
in 2 segfault instances, one generated by an external method that calls
cprof.segfault() and another that was generated by normal load.

http://www.ibccrim.org.br/imagens/data-temp/stdout-20020321-ext-method-segfault
http://www.ibccrim.org.br/imagens/data-temp/stdout-20020321-natural-segfault

The 'No such process' message might be caused by the process dying while
trying to generate the file in the trace_dump() call, but I don't know
why that would be.

I'll see if I can install another Zope instance where it all belongs to
another user, so that we can rule out lack of permissions for this
problem.

-- 
Ideas don't stay in some minds very long because they don't like
solitary confinement.


___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )



Re: [Zope-dev] Re: more on the segfault saga

2002-03-19 Thread Matthew T. Kromer

Leonardo Rochael Almeida wrote:


The official unofficial Zope place on irc is #zope at
irc.openprojects.net. Lots of cool and very knowledgeable people hang
out there.


OK, I put up a set of patches and a rather frazzled looking README for a 
profiler patch to Python at

http://www.zope.org/Members/matt

You want the C profiler patch; you have to build your OWN python 2.1.2 
and it will probably only work under Linux -- don't bother with Windows, 
parts of the code use mmap() for speed and Windows doesn't provide mmap.

There's a README document inside that has some rather vague and minimal 
installation instructions.  This is very definitely use-at-your-own-risk 
stuff.  I'm posting notice here because others are interested in trying 
to help diagnose the Zope crashing problem so this serves as a reminder 
of where something is as it sits in your inbox waiting for bits to decay.

Here's the readme in its entirety:


To activate python tracing

Rebuild a clean python 2.1.2 with the two patches (included) applied.

Patch 1 is for the garbage collector module, it installs a segfault handler
which allows for an environment variable  CRASHCMD  to be present to
tell python what to do in the event of a segfault.

Patch 2 is a patch to ceval.c which builds in additional tracing.

The cprof module must be built; a simple

make -f Makefile.pre.in PYTHON=/path/to/rebuilt/python2.1.2

will build the cprof module.


Once built, test the cprof module


/path/to/rebuilt/python2.1.2

 >>> import cprof
 >>> cprof.activate()
 >>> cprof.dump(filename)

and the filename specified should be created.  For the curious, the pb.py
program will play back the trace file to get data out of it.

 PATCHING ZOPE TO USE THIS 

Replace Zope's ZServer/PubCore/ZServerPublisher file with the included one.
Patch the line that contains the gdb command to point to your rebuilt 
python.

Copy the file gdb.cmd to where you start Zope.

Copy the file cprof.so to lib/python in your Zope directory

Start Zope.  Wait.  GDB will be invoked to gather crash data; save the
gdb output if possible (keep stdout from gdb).


Unfortunately, the README forgets to mention that you need to run Zope 
under the patched python.  Whoops.

-- 
Matt Kromer
Zope Corporation  http://www.zope.com/ 







Re: [Zope-dev] Re: more on the segfault saga

2002-03-15 Thread Matthew T. Kromer

Hi Martijn,

We're basically just trying to construct traps to try to identify a 
smoking gun.  The upside is, if it works, we'll be able to fix the bug 
very quickly.  However, it's based on assumptions about the exact nature 
of the bug -- so each trap I write essentially is making a hypothesis 
and then testing it.

Because Leo can get the crash very quickly, if you have a difficult time 
reproducing it, you don't need to spend a lot of effort trying to keep 
up with my traps.


On Friday, March 15, 2002, at 06:19 AM, Martijn Jacobs wrote:


 Hello Leo, Matt, Brian,

 I'm on it. Will send results when they're available. If anyone wants
 to talk to me during the period, I'll be on IRC.

 If you need any assistance for anything, I'm at your service
 Which channel/server are you on IRC?

 Did somebody succeed in reproducing the crash? We try the best we can to
 make a reproducible testcase, but Zope doesn't want to crash here... The
 clients who use the production Zope which crashes are all using Active
 Desktop (I know :( ), could that be of any matter?
 Technically it's insane if it does matter, but you never know...

 I'm out of capabilities right now, don't know what to do anymore, so I
 hope the bug will be found soon.


 regards,

 martijn













Re: [Zope-dev] Re: more on the segfault saga

2002-03-15 Thread Leonardo Rochael Almeida

On Fri, 2002-03-15 at 08:19, Martijn Jacobs wrote:
 
 Hello Leo, Matt, Brian, 
 
  I'm on it. Will send results when they're available. If anyone wants 
  to talk to me during the period, I'll be on IRC.
 
 If you need any assistance for anything, I'm at your service
 Which channel/server are you on IRC?

The official unofficial Zope place on irc is #zope at
irc.openprojects.net. Lots of cool and very knowledgeable people hang
out there.

I'll be there today while I apply Matt's incref patches and run Zope.

I also have a very demanding client who goes berserk every time the site
is down, so I recommend you do the following, if you want to help with
debugging (this assumes you run Zope behind a proxy server such as
apache or squid):

* Install ZEO on your current Zope and configure both the ZEO Client and
Server on the same machine serving your site. Only the ZEO Client should
get the segfaults and it restarts much faster (less than 10 secs,
usually) than in standalone mode.

* Open a source Zope package in another directory. Open a Python source
package next to it. Configure Python to install its files inside this
Zope tree (./configure --prefix=/path/to/Zope-src). Apply Matt's
patches, make and make install. Install ZEO in this instance but only
configure the ZEO Client, making it listen on a different port from the
other Zope. Copy over all the external methods and extra products, and
make it access the other instance's ZEO Server. Don't forget to REDIRECT
STDERR TO A FILE (the best way is to redirect stderr to stdout and
append stdout to a file). Start it and check that it's working as
expected.

* Keep two configuration files of your frontend proxy around: one
pointing the site to the original Zope and another pointing the site to
the instrumented Zope. When you want to test the crashes, switch the
conf. files around and reload the proxy.

* Report everything you find in Zope stderr.

* If you want to increase the perceived stability of your site, put the
two following lines somewhere in the original Zope z2.py:

import gc
gc.disable()

It should stop crashing, but it'll start leaking memory instead. If the
leak is mild enough that you only need to restart once a day, in the
period of least traffic, then leave it that way. Running a ZEO Client
will keep the downtime of that restart to a minimum.
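
To see why disabling the gc trades crashes for leaks, here is a small
standalone sketch (not Zope code; the Node class and make_garbage
function are made up for illustration). Objects that refer to
themselves never reach a reference count of zero, so with the automatic
collector off they stay in memory until someone calls gc.collect() by
hand.

```python
import gc


class Node:
    """Hypothetical object that refers to itself, forming a cycle."""

    def __init__(self):
        self.me = self


def make_garbage(n):
    """Create n unreachable cycles; reference counting cannot free them."""
    for _ in range(n):
        Node()


gc.collect()   # start from a clean slate
gc.disable()   # what the two z2.py lines above do
try:
    make_garbage(1000)
    # With automatic collection off, the cycles just sit in memory.
    # A manual pass reclaims them and returns how many unreachable
    # objects it found.
    freed = gc.collect()
finally:
    gc.enable()

print("objects reclaimed by the manual pass:", freed)
```

In a long-running server that never calls gc.collect(), that garbage
only goes away at restart, which is why the once-a-day restart is
needed.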

-- 
Ideas don't stay in some minds very long because they don't like
solitary confinement.





Re: [Zope-dev] Re: more on the segfault saga

2002-03-14 Thread Martijn Jacobs


Sorry, the correct URL is http://www.coherence.nl/crash.txt
(without the dot)


martijn.







Re: [Zope-dev] Re: more on the segfault saga

2002-03-13 Thread Leonardo Rochael Almeida

On Wed, 2002-03-13 at 10:05, Martijn Jacobs wrote:
 [...]
 
 I don't know where to start, because attaching GDB doesn't make any
 sense, since you have to start Zope single-threaded (according to Matt's
 Stability Howto) and then no crashes occur.

Actually, at least on Linux, with a recent gdb, you can attach gdb to
Zope in multithreaded mode. Just remove the -t 1 from the command line
suggested by the StabilityHOWTO and you're set. Best results are achieved
by compiling everything from source (Python even; use
--prefix=/path/to/zope-src so as not to mix it up with your installed
Python, and be careful to use this Python when installing Zope) and
running:

$ VARIABLE=value gdb path/to/your/python
(gdb) run z2.py -Z '' 

where VARIABLE=value should be replaced by the env vars that are set in
the ./start script inside Zope.
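
If you would rather script the launch than type it, the same invocation
can be sketched in Python. This is only an illustration: the function
name gdb_invocation and the paths are placeholders of mine, and
VARIABLE=value again stands in for whatever your ./start script sets.
It relies on gdb's --args option, which hands everything after the
program name to the program being debugged.

```python
import os
import shlex


def gdb_invocation(python_path, zope_env, z2_args=("z2.py", "-Z", "")):
    """Return (argv, env) for running Zope's z2.py under gdb."""
    env = dict(os.environ)
    env.update(zope_env)   # env vars copied from the ./start script
    # Everything after python_path is passed to python, not to gdb.
    argv = ["gdb", "--args", python_path] + list(z2_args)
    return argv, env


argv, env = gdb_invocation("path/to/your/python", {"VARIABLE": "value"})
print(" ".join(shlex.quote(a) for a in argv))
```

Passing argv and env to subprocess.call(argv, env=env) from the Zope
directory should then drop you at the (gdb) prompt, where a plain "run"
starts Zope with those arguments.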

 Is this problem solved if I install python 2.2 for example? Are there
 any bugfixes in this release from Python 2.1.2 ?

No, as far as I know.

 I don't know what the status is right now? Is zope corp. working on it
 trying to find the bug? Can I be of any help tracking down this bug? 

I don't know about Zope Corp. in general, but Matt Kromer has been
trying to help as much as his time permits.

I think you're helping a lot just by reporting this problem because it
helps raise awareness of the fact that the stability problems aren't all
solved with the last Zope/Python releases. So far there are three
confirmed cases of instability: yours, mine and Dario's. All of them
seem to involve PythonScripts, although this might not be related, and
all of them are solved by using '-t 1' (is that correct, Dario?) so it
looks like a threading issue.

Let's just hope ZC or someone else in the community with more knowledge
of the Zope/Python internal arcana can help us debug this, 'cause it's
reached the limit of our exploration capability.

Cheers, Leo





Re: [Zope-dev] Re: more on the segfault saga

2002-03-13 Thread Martijn Jacobs


 Actually, at least on Linux, with a recent gdb, you can attach gdb to
 Zope in multithreaded mode. Just remove the -t 1 from the command line
 suggested by the StabilityHOWTO and you're set. Best results are achieved
 by compiling everything from source (Python even; use
 --prefix=/path/to/zope-src so as not to mix it up with your installed
 Python, and be careful to use this Python when installing Zope) and
 running:


Ok, I succeeded in attaching gdb on the production server. I have to
wait until tomorrow for results, because in the evening the intranet is
not used by the specific company :) Tomorrow it will crash for sure,
because it crashes about 20 to 30 times a day, so then I will post the
results as soon as possible!

It's very frustrating that we cannot reproduce this bug in our own
environment, whatever we try. (All workstations requesting like hell,
but we cannot succeed in crashing it!)

It's very nice to hear that you people are trying to solve the problem,
also thanks to the guys from Zope Corp. who are spending their time on
it!

Hope the bug will be resolved soon.


kind regards,

martijn jacobs

