After upgrading from PyLucene 2.4.1 to the 2.9 series, I've started hitting OutOfMemory errors, which I had never seen before with PyLucene. I'm on a machine with lots of RAM and am indexing only about 200K docs, so this seems odd. It appears to happen once I've serially generated more than 500 short-lived, Java-attached threads (I do the detach after the thread has run). Each thread typically does some searches and indexes a document before it expires, and the next thread is not created until the previous one has finished.
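To make the thread lifecycle described above concrete, here is a minimal sketch, not the actual application code. It assumes `env` is the object returned by `lucene.getVMEnv()` and that the detach happens at the end of the thread's own run. The helper name `run_in_attached_thread` and the `work` callable are hypothetical.

```python
import threading


def run_in_attached_thread(env, work):
    """Run work() on a fresh thread that stays attached to the JVM while it runs.

    env  -- assumed to be the object returned by lucene.getVMEnv()
    work -- hypothetical zero-argument callable (e.g. a few searches,
            then indexing one document)
    """
    def target():
        env.attachCurrentThread()        # attach before touching any Lucene object
        try:
            work()
        finally:
            env.detachCurrentThread()    # detach once the thread's work is done

    t = threading.Thread(target=target)
    t.start()
    t.join()                             # serial: wait before the caller starts another


# Intended use with PyLucene (not executed here; docs and handle are hypothetical):
#   import lucene
#   lucene.initVM(lucene.CLASSPATH)
#   env = lucene.getVMEnv()
#   for doc in docs:
#       run_in_attached_thread(env, lambda d=doc: handle(d))
```

Calling this helper in a loop reproduces the pattern of one attach and one detach per short-lived thread, with no two threads alive at the same time.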
I use several custom extensions, mainly a StringSet subclass of lucene.PythonSet, and two subclasses of lucene.PythonMultiFieldQueryParser. Looking back at the correspondence from Jan 2008, I don't believe I have to do anything special to make sure these are GC'ed. I'll see if I can produce a simplified example for you. Is there anything else I should look at?

I'm currently running PyLucene 2.9.3 on OS X 10.5.8 with Python 2.5 and the latest Apple Java distro.

Bill