On Mon, 21 Jan 2008, anurag uniyal wrote:

It does solve the problem for Custom Analyzer and parsers etc.
But my code with Custom filters still goes out of memory.
In the code below, if I comment out the 'result = MyFilter(result)' line,
it works.

I don't seem to be able to reproduce this. It's working fine for me. I even increased the loop to 1,000,000 iterations. Monitoring the process, I see its size remain constant too.

Maybe the __del__() method is causing trouble ?
But I left it in and all seemed fine for me.

So, what's different here ?

  - did you rebuild JCC ?
  - did you rebuild PyLucene ? (what's lucene.VERSION returning ?)
  - what version of Python are you using ?
  - on what OS ?
  - what version of Java ?
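Most of the answers above can be gathered in one go. This is only a rough sketch: environment_report() is a made-up helper name, it assumes lucene.VERSION is available once PyLucene imports, and it does not cover the Java version (run `java -version` separately for that).

```python
import platform
import sys


def environment_report():
    """Collect the Python, OS and PyLucene versions asked about above."""
    report = {
        "python": sys.version.split()[0],   # e.g. the interpreter's X.Y.Z
        "os": platform.platform(),          # OS name and release
    }
    try:
        import lucene                        # only present if PyLucene is built
        report["pylucene"] = lucene.VERSION
    except ImportError:
        report["pylucene"] = None            # PyLucene not importable here
    return report


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("%s: %s" % (key, value))
```

Whether JCC and PyLucene were both rebuilt still has to be checked by hand; the script cannot tell.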

Andi..


----
import lucene
lucene.initVM(lucene.CLASSPATH, maxheap='1m')
from lucene import (Token, PythonAnalyzer, PythonTokenStream, 
StandardTokenizer, LowerCaseFilter)
from lia.analysis.AnalyzerUtils import AnalyzerUtils
class MyFilter(PythonTokenStream):
    count = 0
    filters = []
    def __init__(self, tokenStream):
        super(MyFilter, self).__init__()
        self.input = tokenStream
        MyFilter.count += 1
        self.id = MyFilter.count
        MyFilter.filters.append(self.id)

    def next(self):
        return self.input.next()

    def __del__(self):
        #self.input = None
        MyFilter.filters.remove(self.id)

class MyAnalyzer(PythonAnalyzer):
    def __init__(self):
        super(MyAnalyzer, self).__init__()

    def tokenStream(self, fieldName, reader):
        result = StandardTokenizer(reader)
        result = LowerCaseFilter(result)

        # my filtering
        result = MyFilter(result)

        return result
text = 'TESTING the TESTS'
analyzer = MyAnalyzer()
try:
    for i in xrange(10000):
        if i%100==0:print i
        tokens = AnalyzerUtils.tokensFromAnalysis(analyzer, text)
except lucene.JavaError,e:
    print i,e
    print "%s MyFilter remain:"%len(MyFilter.filters)
    print MyFilter.filters
-----
rgds
Anurag


----- Original Message ----
From: Andi Vajda <[EMAIL PROTECTED]>
To: [email protected]
Sent: Sunday, 20 January, 2008 7:18:34 AM
Subject: Re: [pylucene-dev] finalizing the deadly embrace


On Thu, 17 Jan 2008, Andi Vajda wrote:

Thinking about this some more, I believe that Anurag's finalizer proxy idea is on the right track. It provides the "trick" needed to break the deadly embrace when the ref count of the python object is down to 1, that is, down to when the only reference is the one from the Java parent wrapper.

When the finalizer proxy's refcount goes to zero, it is safe to assume that only Java _may_ still be needing the object. This is enough then to replace the strong global reference to the Java parent wrapper with a weak global reference thereby breaking the deadly embrace and letting Java garbage collect it when its time has come. When that time comes, the finalize() method on it is normally called by the Java garbage collector and the python ref count to the Python extension instance is brought to zero and the object is finally freed.

This assumes, of course, that when such an extension object is instantiated, the finalizer proxy is actually returned.
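The idea can be modelled in pure Python, purely as an illustration. Everything named here (STRONG_GLOBAL_REFS, JavaWrapper, Extension, FinalizerProxy, run_demo) is a hypothetical stand-in: the dict plays the role of the strong global references held on the Java side, and the real mechanism would live in JCC's C/C++ code, not in Python.

```python
import gc
import weakref

# Stand-in for strong global references, which act as GC roots in Java.
STRONG_GLOBAL_REFS = {}


class JavaWrapper(object):
    """Hypothetical stand-in for the Java parent wrapper."""

    def __init__(self, py_obj):
        # The Java side holds a counted reference to the Python instance.
        self.py_obj = py_obj


class Extension(object):
    """Hypothetical stand-in for a Python extension of a Java class."""

    def __init__(self):
        wrapper = JavaWrapper(self)
        # Strong reference Python -> Java: this closes the deadly embrace.
        STRONG_GLOBAL_REFS[id(self)] = wrapper
        # Weak reference that remains once the strong one is dropped.
        self.java = weakref.ref(wrapper)


class FinalizerProxy(object):
    """What user code receives instead of the extension instance itself."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called for attributes missing on the proxy: forward them.
        return getattr(self._target, name)

    def __del__(self):
        # No Python code holds the proxy any more, so only Java may still
        # need the object: drop the strong reference, keep the weak one.
        STRONG_GLOBAL_REFS.pop(id(self._target), None)


def run_demo():
    proxy = FinalizerProxy(Extension())
    probe = weakref.ref(proxy._target)
    held_before = len(STRONG_GLOBAL_REFS)
    del proxy      # last Python reference to the proxy goes away
    gc.collect()   # for good measure; CPython refcounting already ran
    return held_before, len(STRONG_GLOBAL_REFS), probe() is None
```

Under CPython, run_demo() should return (1, 0, True): one strong reference while the proxy is alive, none afterwards, and the extension instance freed. Holding an Extension directly, without the proxy, leaves its entry in STRONG_GLOBAL_REFS forever, which is the leak described above.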

I should be able to implement this in C/C++ so that the performance hit is minimal and in a way that is transparent to PyLucene users.


I checked the implementation of this idea into svn trunk rev 381.
It is no longer necessary to call finalize() by hand :)

I removed the finalize() calls from test_PythonDirectory.py, and test_Sort.py can now be run forever without any leakage.

It is necessary to rebuild both JCC and PyLucene to try this out.
I'd be curious to see if this solves your problem, Brian ?

Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev

