Re: [Zope-dev] Very severe memory leak

2003-08-24 Thread Jens Vagelpohl
Well, to at least reduce the immediate pressure, why don't you throw 
more RAM into that server? Memory is cheap.

jens

On Friday, Aug 22, 2003, at 17:38 US/Eastern, Leonardo Rochael Almeida 
wrote:

Hi,

For a long time now, one of our clients, running Zope 2.5.1, has been
experiencing memory leaks. For a while this was relieved by
restarting Zope every day at 4am.
Lately this was not enough, however, as Zope started taking more and
more memory, to the point that it frequently required more than one
restart during the day.
The machine this is running on is a 500 MHz Xeon with 512 MB of memory.
Zope is a very memory-intensive application, but 512 MB used to be
enough for a medium-sized portal.

This Zope site makes very heavy use of both ZClasses and ZCatalog, and
it appears to me that this worsening of performance and memory
consumption was being caused by the increase in content, which caused
an increase in the size of the ZCatalog and in the number of ZClass
instances. A total of 6796 ZClass instances representing News Items are
cataloged. There are other ZCatalogs and ZClass instances representing
calendar events and other stuff (this is a very old Zope portal coded
by hand in ZClasses: no CMF, no Plone, not even Pagina1, our
ZClass-based CMS).

Fiddling with the cache parameters in the control panel showed that
we could keep memory consumption low enough for the daily restart to
suffice (say, with 10k objects per thread), but then we would get
constant thrashing of cache objects, especially DateTime objects due to
the ZCatalog queries, and the machine's performance would be close to
intolerable. If the cache parameters were set to allow fast
performance (with 50k objects per thread), the machine would run out of
memory in 3 to 4 hours. Needless to say, this was with heavy use of
RAMCacheManagers, not counting the accelerator proxy in front of it.
Without the RAMCaches, the machine would go down in under 5 minutes of
work-hour load. Even with the caches on, the load would never go below
2.0 during work hours.

Last Tuesday we decided to stop waiting for 2.6.2 and migrated the
site to 2.6.1. We dealt with the ObjectManager-based-ZClass issue,
reworked the ZCatalogs to replace the DateTime FieldIndexes with
DateTimeIndexes and then had a working testing environment, which we
stress tested lightly without detecting any problems and quickly moved
to production. This was late at night.
The next morning we were surprised to notice the machine very quickly
ran out of memory. The memory leak was *far more severe* than before.
Zope needed a restart every 15 minutes or so, or it would send the
machine into heavy swapping.
On a very non-intuitive hunch I suggested we shut down all RAMCaches
and, amazingly enough, this made the situation a bit more manageable.
We're now restarting every 45 minutes. To our relief, disabling the
RAMCaches had only a barely noticeable effect on performance. The site
kept churning out pages really fast, a testament to the optimization
job done in the 2.6 series. The load on the machine is rarely above
0.8, except when it goes into swap :-)

The number of DateTime refcounts in the Control_Panel, although much
smaller than in Zope 2.5.1, is very high and, most importantly,
constantly increasing, as far as I can tell. After 12 minutes of
uptime, the top refcounts are:

DateTime.DateTime.DateTime: 96168
BTrees._IOBTree.IOBucket: 43085
BTrees._IIBTree.IIBTree: 40400
BTrees._IIBTree.IIBucket: 23696
OFS.DTMLDocument.DTMLDocument: 23190
BTrees.OIBTree.OIBucket: 14582
BTrees._IIBTree.IISet: 12479
BTrees._IIBTree.IITreeSet: 10823
BTrees.OOBTree.OOBucket: 7088
OFS.Image.Image: 6860
OFS.DTMLMethod.DTMLMethod: 5894
DocumentTemplate.DT_Util.Eval: 3250
OFS.Image.File: 2796
BTrees._IOBTree.IOBTree: 2761
ZClasses.Method.MWp: 1592

In time, the DateTime refcount dwarfs the second-place entry by an
order of magnitude. I think this is related to the fact that DateTime
instances are stored as metadata, even though the date indexes have
been converted to DateTime indexes. The question is, why aren't those
instances being released? What is holding on to them?
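
One way to see which classes are actually growing is to save the refcount table twice and diff it. The following is only a minimal sketch: it assumes the two tables were copied by hand from the Control_Panel refcounts page into dicts, `refcount_growth` is a hypothetical helper, and the numbers in the second snapshot are made up purely for illustration.

```python
def refcount_growth(before, after, top=5):
    """Return (name, delta) pairs for the classes whose count grew most."""
    deltas = []
    for name, count in after.items():
        delta = count - before.get(name, 0)
        if delta > 0:
            deltas.append((name, delta))
    # Largest growth first; ties broken by name for stable output.
    deltas.sort(key=lambda item: (-item[1], item[0]))
    return deltas[:top]


# First snapshot: three of the figures reported above (12 minutes of uptime).
snapshot_12min = {
    "DateTime.DateTime.DateTime": 96168,
    "BTrees._IOBTree.IOBucket": 43085,
    "OFS.Image.Image": 6860,
}
# Second snapshot: HYPOTHETICAL numbers, not measurements from the site.
snapshot_later = {
    "DateTime.DateTime.DateTime": 150000,
    "BTrees._IOBTree.IOBucket": 43100,
    "OFS.Image.Image": 6860,
}

for name, delta in refcount_growth(snapshot_12min, snapshot_later):
    print("%-32s +%d" % (name, delta))
```

Classes whose count is stable between snapshots drop out of the listing, so what remains is the short list worth investigating.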

I tried installing the LeakFinder product but discovered it didn't work
before stumbling on a message in the archives that told me exactly that
:-) The RefCounts view in the LeakFinder object fails with the
following traceback:

Traceback (innermost last):
  [...]
  Module DocumentTemplate.DT_Util, line 201, in eval
   - __traceback_info__: REQUEST
  Module string, line 0, in ?
  Module Products.LeakFinder.LeakFinder, line 240, in manage_getSample
  Module Products.LeakFinder.LeakFinder, line 163, in getControlledRefcounts
  Module Products.LeakFinder.LeakFinder, line 188, in resetCache
TypeError: function takes at most 2 arguments (3 given)

The code in question is:

def resetCache(c, gc):
    cache = c._cache
    if gc:
        cache.minimize(3)  # The minimum that actually performs gc.
 

Re: [Zope-dev] Very severe memory leak

2003-08-24 Thread Shane Hathaway
On 08/22/2003 05:38 PM, Leonardo Rochael Almeida wrote:
  In time, DateTime refcounts eventually dwarves the second place by an
  order of magnitude. I think this is related to the fact that DateTime
  instances are stored as metadata, even though the date indexes have been
  converted to DateTime indexes. The question is, why aren't those
  instances being released? What is holding on to them?

When you flush the cache, those DateTimes should disappear.  If they 
don't, the leak is keeping them.

  I tried installing the LeakFinder product but discovered it didn't work
  before stumbling in a message in the archives that told me exactly that
  :-) The RefCounts view in the LeakFinder object fails with the following
  traceback:

LeakFinder is an early attempt to share some of the techniques I use for 
finding leaks.  Unfortunately, those techniques aren't very useful until 
you've already searched for weeks, which means LeakFinder isn't very 
good for emergencies.

Here are the things you should look at first:

1) The ZODB cache size.  The meaning of this number changed dramatically 
in 2.6.  Before 2.6 it was a very vague number.  In 2.6 it's a target 
number of objects that Zope actually tries to maintain.  Before 2.6 it 
might have made sense to set the ZODB cache size to some arbitrarily 
high number like 100,000; in 2.6 you want to start at about 2000 and 
adjust from there.  There are tools in 2.6 for helping you adjust the 
number.

2) The number of ImplicitAcquisitionWrappers present in the system.  I 
have found it to be a reliable indicator of whether you have a leak or 
not.  Expect this number to stay under 400 or so.  If it grows 
gradually, there's a leak.  Watch the refcounts screen.

3) Is Python compiled with cyclic garbage collection enabled?  Zope 2.4 
and above absolutely require cyclic garbage collection.

4) Don't use 2.6.1.  Use 2.6.2, which has fixes for known leaks.  It is 
actually already tagged in CVS as Zope-2_6_2, and it's what zope.org 
is now running.  Various unrelated things prevented a formal release 
this week.
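
For point 3, a quick check from a Python prompt can confirm that the cycle collector is present and switched on. This is a minimal sketch written against a current Python; `cyclic_gc_status`, `Node` and `collect_one_cycle` are hypothetical names used only for illustration.

```python
import gc


def cyclic_gc_status():
    """Report whether the cycle collector is switched on, plus its thresholds."""
    return {
        "enabled": gc.isenabled(),
        "thresholds": gc.get_threshold(),
    }


class Node(object):
    """Tiny class used only to build a reference cycle."""


def collect_one_cycle():
    """Create a two-object cycle, drop it, and return what gc.collect() found."""
    gc.collect()  # start from a clean slate
    a, b = Node(), Node()
    a.other, b.other = b, a  # a <-> b: refcounts alone can never free these
    del a, b
    return gc.collect()  # number of unreachable objects found


print(cyclic_gc_status())
print("unreachable objects collected:", collect_one_cycle())
```

If the second call reports zero, cycles are not being found, and any object caught in a reference cycle will stay in memory for the life of the process.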

If all else fails, grep all Python modules for sys._getframe() and 
sys.exc_info().  These are the primary causes of memory leaks in 
Python 2.1 and below.  If you're brave, you can just run Zope under 
Python 2.2, which fixes those particular leaks AFAIK.
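
The grep can also be done from Python, which is handy on machines without a convenient shell. A minimal sketch, assuming the modules live under a single directory tree; `find_suspect_calls` is a hypothetical helper, and a hit only marks a line to review, not a confirmed leak.

```python
import os
import re

# Matches calls such as sys._getframe() or sys.exc_info().
SUSPECT = re.compile(r"sys\.(_getframe|exc_info)\s*\(")


def find_suspect_calls(root):
    """Yield (path, line_number, line) for each suspect call under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps the scan going on oddly encoded files.
            with open(path, "r", errors="replace") as handle:
                for number, line in enumerate(handle, 1):
                    if SUSPECT.search(line):
                        yield path, number, line.strip()
```

Iterating over `find_suspect_calls(some_directory)` and printing each tuple gives output similar to `grep -n`. Code that stores the traceback from `sys.exc_info()` in a local variable is the pattern most worth a closer look, since the traceback refers back to the frame that holds it.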

Finally, there's always hope. :-)  The latest thing I've been doing is 
running Zope in a debug build of Python.  A debug build makes a magical 
sys._getobjects() available, allowing you to inspect all live objects 
through a remote console.  Since debug builds aren't much slower than 
standard builds, you can even run a debug build in production for short 
periods of time.  I've been building a small library of functions for 
working in this mode, and if you need them, I'll pass them along.  I'd 
have to warn you that they are anything but intuitive in their purpose 
and use, though.
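
For readers without a debug build, a rougher version of the same idea is possible with the standard `gc` module. This sketch is not the library mentioned above: it uses `gc.get_objects()`, which only sees container objects tracked by the collector rather than every live object, and `live_object_counts` is a hypothetical helper name.

```python
import gc
from collections import Counter


def live_object_counts(top=10):
    """Count gc-tracked objects by type name, most common first."""
    counts = Counter()
    for obj in gc.get_objects():
        kind = type(obj)
        counts["%s.%s" % (kind.__module__, kind.__name__)] += 1
    return counts.most_common(top)


for name, count in live_object_counts():
    print("%-40s %d" % (name, count))
```

Running it at intervals and comparing the listings shows which types keep accumulating.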

Shane

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )


Re: [Zope-dev] Can't build 2.6.2-b5 on Redhat 7.3

2003-08-24 Thread Anthony Baxter

 Paul Winkler wrote
 Like the subject says... python2.1 wo_pcgi fails...
 
 this is the same python 2.1.3 that I built from source, and which
 I used to build and run zope 2.6.1 for a few months now...
 
 gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -g -O2 -Wall
   -Wstrict-prototypes -fPIC 
   -I/zope/Zope-2.6.2b5-src/lib/Components/ExtensionClass/src
   -I/usr/local/include/python2.1 -c AccessControl/cAccessControl.c 
   -o build/temp.linux-i686-2.1/cAccessControl.o
 In file included from AccessControl/cAccessControl.c:54:
   /zope/Zope-2.6.2b5-src/lib/Components/ExtensionClass/src/ExtensionClass.h:94:20:
   Python.h: No such file or directory

It can't find Python.h - it's looking in /usr/local/include/python2.1
for it - is the file there? Is it readable by the user that the build
is running under?
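
Both questions can be answered with a few lines of Python run as the build user. A minimal sketch: `check_python_header` is a hypothetical helper, and the `sysconfig` module it uses by default comes from current Python versions, so with an older interpreter the include directory from the compiler command line (here /usr/local/include/python2.1) would have to be passed in explicitly.

```python
import os
import sysconfig


def check_python_header(include_dir=None):
    """Return (header_path, exists, readable) for Python.h."""
    if include_dir is None:
        # Include directory of the interpreter running this script.
        include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    exists = os.path.isfile(header)
    readable = exists and os.access(header, os.R_OK)
    return header, exists, readable


header, exists, readable = check_python_header()
print("%s: exists=%s readable=%s" % (header, exists, readable))
```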

Anthony
-- 
Anthony Baxter [EMAIL PROTECTED]   
It's never too late to have a happy childhood.




Re: [Zope-dev] Can't build 2.6.2-b5 on Redhat 7.3

2003-08-24 Thread Dieter Maurer
Paul Winkler wrote at 2003-8-20 19:19 -0400:
  Like the subject says... python2.1 wo_pcgi fails...
  
  this is the same python 2.1.3 that I built from source, and which
  I used to build and run zope 2.6.1 for a few months now...
  
  
  --
  Building extension modules
  /usr/local/bin/python2.1 setup.py build_ext -i
  running build_ext
  building 'AccessControl.cAccessControl' extension
  gcc -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -g -O2 -Wall -Wstrict-prototypes 
  -fPIC -I/zope/Zope-2.6.2b5-src/lib/Components/ExtensionClass/src 
  -I/usr/local/include/python2.1 -c AccessControl/cAccessControl.c -o 
  build/temp.linux-i686-2.1/cAccessControl.o
  In file included from AccessControl/cAccessControl.c:54:
  /zope/Zope-2.6.2b5-src/lib/Components/ExtensionClass/src/ExtensionClass.h:94:20: 
  Python.h: No such file or directory
  ...

The compiler cannot find Python.h in /usr/local/include/python2.1.
Maybe the Python development package is not (correctly) installed?


Dieter



Re: [Zope-dev] Very severe memory leak

2003-08-24 Thread Dieter Maurer
Leonardo Rochael Almeida wrote at 2003-8-22 18:38 -0300:
  ...
  I tried installing the LeakFinder product but discovered it didn't work
  before stumbling in a message in the archives that told me exactly that
  :-) The RefCounts view in the LeakFinder object fails with the following
  traceback:
  
  Traceback (innermost last):
    [...]
    Module DocumentTemplate.DT_Util, line 201, in eval
     - __traceback_info__: REQUEST
    Module string, line 0, in ?
    Module Products.LeakFinder.LeakFinder, line 240, in manage_getSample
    Module Products.LeakFinder.LeakFinder, line 163, in getControlledRefcounts
    Module Products.LeakFinder.LeakFinder, line 188, in resetCache
  TypeError: function takes at most 2 arguments (3 given)
  ...
  c._cache = PickleCache(c, cache.cache_size, cache.cache_age)

PickleCache has now dropped the cache_age parameter.
Try: PickleCache(c, cache.cache_size)
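
If the same LeakFinder code has to run against both the old and the new signature, one option is to try the three-argument call first and fall back to two arguments. This is only a sketch of that pattern: `make_cache` is a hypothetical helper, and the two classes below are stand-ins for the two call signatures, not the real ZODB classes.

```python
def make_cache(cache_class, jar, size, age=None):
    """Build a cache, retrying without `age` if the class rejects it."""
    if age is not None:
        try:
            return cache_class(jar, size, age)
        except TypeError:
            # Newer signature: the age argument no longer exists.
            pass
    return cache_class(jar, size)


# HYPOTHETICAL stand-ins for the old and new constructors; they are not
# the real ZODB classes, only the two call signatures discussed above.
class OldStyleCache(object):
    def __init__(self, jar, size, age):
        self.jar, self.size, self.age = jar, size, age


class NewStyleCache(object):
    def __init__(self, jar, size):
        self.jar, self.size = jar, size


old = make_cache(OldStyleCache, "jar", 2000, 60)
new = make_cache(NewStyleCache, "jar", 2000, 60)
print(old.age, hasattr(new, "age"))
```

The fallback also swallows a TypeError raised for any other reason inside the constructor, so for a one-off fix simply dropping the third argument is the safer change.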


Dieter



Re: [Zope-dev] __call__() takes exactly 1 argument (4 given)!!!!!!!!!!!

2003-08-24 Thread Dieter Maurer
alan milligan wrote at 2003-8-20 04:43 +:
  ...
  mapply is often tricked out by a func_code definition.
  Maybe, your class has such an attribute and it does not
  match the __call__ signature?
  
  Hmmm - this isn't MY class, it's HelpSys::STXTopic.  I can't see any 
  func_code definition in the HelpSys source.  STXTopic's __call__ function is 
  declared as __call__(self, REQUEST=None) which looks quite fine to me.
  
  Therefore mapply is getting it wrong and passing too many parameters into 
  this call.  How do I discover what mapply is passing???

You use a debugger.

I would add "import pdb; pdb.set_trace()" in mapply,
start Zope in a console window and make the failing request.

Zope will enter the debugger at the set_trace.
Read the pdb documentation (see the Python Library Reference)
about the available commands.
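
Where an interactive debugger is awkward, a non-interactive variant of the same idea is to log the arguments just before the call is made. A minimal sketch: `log_call` is a hypothetical helper, and `Topic` is a stand-in with the `__call__(self, REQUEST=None)` signature quoted above, not the real HelpSys class.

```python
def log_call(func, log=print):
    """Wrap func so each call reports its positional and keyword arguments."""
    def wrapper(*args, **kw):
        log("%s called with %d positional %r and keywords %r"
            % (getattr(func, "__name__", repr(func)), len(args), args, kw))
        return func(*args, **kw)
    return wrapper


# HYPOTHETICAL stand-in for a published object whose __call__ accepts only
# an optional REQUEST, like the signature quoted above.
class Topic(object):
    def __call__(self, REQUEST=None):
        return "rendered"


topic = log_call(Topic())
try:
    topic("a", "b", "c")  # three extra positionals on top of self
except TypeError as exc:
    print("TypeError:", exc)
```

The logged line shows exactly which values arrived, which is the information the debugger session would otherwise be used to find.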


Dieter



Re: [Zope-dev] Very severe memory leak

2003-08-24 Thread kosh
On Friday 22 August 2003 03:38 pm, Leonardo Rochael Almeida wrote:
 Hi,

 In trying to narrow down the causes of the leak, we detected what looks
 like a minor one. If you insert a very simple dtml method in Zope root
 and bang on it at a rate of, say, 100 requests per second, Zope will
 increase its memory footprint very slowly, say, a couple hundred KB
 every 10 minutes (both RSS and virtual size). I figure this has
 something to do with memory fragmentation or some other low level stuff.
 But this is definitely not our main concern, which is to find out how is
 it that Zope is leaking DateTimes so heavily in our site, and if it is
 really all those DateTimes that are hogging our memory. We need help,
 and we need it desperately. If anyone wants any other information we'd
 be happy to provide.

Okay here are a few things to try based on this information. 

1) Take the ZClasses and burn them and do it as a Python product instead. From 
items I have seen/done in the past this makes a fairly large difference for 
speed and a little in memory. However, with the extra speed you can have 
smaller caches.

2) Convert dtml objects to Python scripts where feasible. Python scripts often 
run 10-40 times faster than dtml does and they seem to have a much better 
memory footprint. With them running a lot faster you can also cut down on 
your caching. You can write scripts that are fairly complex that can 
generate ~ 200 req/s but I have not gotten even close to that with dtml.

3) If you still have problems, use psyco on the Python product that you use to 
replace the ZClasses and bind it to the methods that make the most speed 
difference. Caching is good but having it run fast enough to not need caching 
is often better.

I have a fair number of sites now that range in size from 100M - 1GB that get 
a fair bit of usage with tens of thousands of objects and they have a memory 
usage of about 20-45M of ram under zope 2.6.1. They used to have a memory 
usage in the 80-400M range but that was before I fixed a number of things.



Re: [Zope-dev] Very severe memory leak

2003-08-24 Thread Chris McDonough
Leonardo,

If you're using Linux, you might be able to take some of the pressure
off by using AutoLance to automatically restart the server when memory
grows above a certain threshold.

http://zope.org/Members/mcdonc/Products/AutoLance/view
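
The threshold check itself is simple on Linux. The sketch below is not AutoLance's implementation, only an illustration of the idea: it reads the resident set size from /proc/<pid>/status, `parse_vmrss_kb` and `over_threshold` are hypothetical names, and the restart action is deliberately left out.

```python
import re

VMRSS = re.compile(r"^VmRSS:\s+(\d+)\s+kB", re.MULTILINE)


def parse_vmrss_kb(status_text):
    """Extract resident set size in kB from /proc/<pid>/status text."""
    match = VMRSS.search(status_text)
    if match is None:
        return None  # e.g. kernel threads have no VmRSS line
    return int(match.group(1))


def over_threshold(pid, limit_kb):
    """True if the process's resident memory exceeds limit_kb (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        rss = parse_vmrss_kb(handle.read())
    return rss is not None and rss > limit_kb


# Illustrative status text, not output captured from the server in question.
sample = "Name:\tpython\nVmRSS:\t  412340 kB\nThreads:\t4\n"
print(parse_vmrss_kb(sample))
```

A cron job or supervisor script could call `over_threshold` with the Zope process id and trigger whatever restart command the site already uses.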


On Sat, 2003-08-23 at 15:17, Jens Vagelpohl wrote:
 Well, to at least reduce the immediate pressure, why don't you throw 
 more RAM into that server? Memory is cheap.
 
 jens
 


