Andi,
I recompiled the latest version from the trunk and this doesn't seem to fix
my problems with my bigram "BrianAnalyzer". It still seems to increase
refcounts by about 3000 per 1000 documents. It seems like a very simple
subclass with only a single instance. I wasn't able to tell if it fell into
the first or the second class of problems you describe (apparently it falls
into the second class?).
-brian
On 1/10/08, Andi Vajda <[EMAIL PROTECTED]> wrote:
>
>
> In the past few days, several users reported symptoms of memory leaks in
> jcc-PyLucene. After a bit of sleuthing, I found two leaks:
>
> 1. A reference to a Python Java wrapper was leaked whenever an inherited
> method was called on the Java object from Python (in callSuper).
> Fixing this one was trivial and is checked in to the svn trunk.
>
> I don't know if this fixes the cases reported but it certainly has
> a major impact. For instance, any time searcher.search(query) is
> called, the searcher is leaked (!).
>
> To verify this, run:
> > python test/test_PyLucene.py Test_PyLuceneWithFSStore.test_searchDocuments -loop
> Without the fix, after a short while, the VM runs out of memory.
> With the fix, it seems to be running forever (and speed remains more or
> less constant).
>
> 2. A "deadly embrace" between a Python extension instance and its Java
> parent instance is currently preventing Python extension instances
> and their Java parents from ever being freed. The Python extension
> instance is holding a reference to the Java parent instance and the
> Java parent instance is holding a reference to the Python extension
> instance.
> Without some explicit intervention, this cross-VM cycle can't be
> broken. I'm currently thinking of making it possible to call finalize()
> on these objects manually to break the cycle. I'm also thinking of
> adding a GC thread to the process that would garbage collect the
> extension instances with no more than two counted Python refs. This
> thread, combined with weak global refs on the JNI side, could make
> collecting these Python extension instances semi-automatic. Needless
> to say, I don't like this idea too much and I'm trying to find another,
> less complicated way. In the meantime, I think I'm going to be adding
> support for the manual way via finalize() shortly.
>
> This leak (still in svn trunk) is not normally that bad, as Python
> extension instances are rarer (than the earlier leak) and leaking them
> is, normally, not as deadly. Still, there are cases where it is bad, as
> when implementing a Python extension for Directory and its sibling
> classes.
> To see for yourself, try running test/test_PythonDirectory.py in a
> loop. This leak is a problem in the current Chandler release, for
> example, where such a Python extension is used.
>
> More on this leak in the next few days.
>
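For what it's worth, the "deadly embrace" in item 2 can be mimicked in pure Python. This is only an analogy under stated assumptions, not how jcc is implemented: GLOBAL_REFS stands in for the JNI global refs table, and JavaParent, PythonExtension and this finalize() are hypothetical names made up for the sketch. It tries to show why dropping the last Python-side reference is not enough, and how a manual finalize() call could break the cycle.

```python
import gc
import weakref

# Assumption: a plain dict standing in for the JNI global refs table.
# Anything stored here is kept alive, much as a global ref keeps a
# Java object from being GC'ed.
GLOBAL_REFS = {}


class JavaParent:
    """Hypothetical stand-in for the Java-side parent instance."""

    def __init__(self, extension):
        # The "Java" side holds a reference to the Python extension.
        self.extension = extension


class PythonExtension:
    """Hypothetical stand-in for a Python extension instance."""

    def __init__(self):
        # The Python side holds a reference to its "Java" parent ...
        self.parent = JavaParent(self)
        # ... and the parent is pinned in the global refs table.
        GLOBAL_REFS[id(self.parent)] = self.parent

    def finalize(self):
        # Manual intervention: unpin the parent and cut both links.
        parent = self.parent
        GLOBAL_REFS.pop(id(parent), None)
        parent.extension = None
        self.parent = None


def is_collected_after_drop(call_finalize):
    """Create an extension, drop it, and report whether it was freed."""
    ext = PythonExtension()
    probe = weakref.ref(ext)
    if call_finalize:
        ext.finalize()
    del ext
    gc.collect()
    return probe() is None


# Without finalize() the pinned parent keeps the extension alive.
print(is_collected_after_drop(call_finalize=False))
# With finalize() nothing refers to the extension any more.
print(is_collected_after_drop(call_finalize=True))
```

In this toy version the cycle collector cannot help in the first case because the parent is still reachable from the table, which is roughly the situation described above for the real cross-VM cycle.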
>
> In order to debug these leaks, I improved env._dumpRefs() a bit by adding
> some keywords to it.
>
> _dumpRefs() now can be called in three ways:
>
> - _dumpRefs(): returns a list of tuples (system hash id, ref count).
> These are useful for quickly getting an idea of how many global Java
> referenced objects there are at the moment (these objects are not GC'ed
> by Java until removed from the refs table).
>
> - _dumpRefs(classes=True): returns a dict of { className: instance count }
> to get an idea about how many instances of various classes are being
> thus kept from being GC'ed by Java.
>
> - _dumpRefs(values=True): returns a list of tuples (value string, ref
> count) to get an idea of what the values look like. This is to be used
> with caution as printing out Java values can be expensive.
>
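One way to use the classes=True form when chasing a leak is to take a snapshot before and after a batch of work and compare them. The sketch below is a minimal helper for that, assuming only what is described above, namely that the snapshot is a dict of { className: instance count }. The name ref_growth and the sample class names are made up for illustration; real snapshots would come from env._dumpRefs(classes=True).

```python
from collections import Counter


def ref_growth(before, after):
    """Return {className: increase} for classes whose count grew.

    `before` and `after` are assumed to be dicts shaped like the
    result of _dumpRefs(classes=True): { className: instance count }.
    """
    growth = Counter(after)
    growth.subtract(before)
    # Keep only the classes whose instance count went up.
    return {name: n for name, n in growth.items() if n > 0}


# Made-up snapshots, for illustration only (not real measurements).
before = {"example.ClassA": 10, "example.ClassB": 5}
after = {"example.ClassA": 40, "example.ClassB": 5, "example.ClassC": 2}

for name, increase in sorted(ref_growth(before, after).items()):
    print(name, increase)
```

Classes that keep showing up in the result from one batch to the next are the ones to look at first.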
> It would be interesting to see if the people who recently reported memory
> leak symptoms on this list could try the current trunk and report if that
> solves their problem or at least, improves on it.
> If you are re-building PyLucene from the trunk to try this out, be sure to
> completely rebuild jcc first (the fix is there).
>
> Thank you for your patience !
>
> Andi..
> _______________________________________________
> pylucene-dev mailing list
> [email protected]
> http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
>