On Fri, 2 Sep 2005, Boehm, Hans wrote:

> Unfortunately, I'm neither a PyLucene nor a Python expert.

Simply put, PyLucene is a Python wrapper around a gcj-based library built from Java Lucene.

I suspect that people embedding pure gcj-compiled Java code into a non-Java process would face the very same threading problems.

> What does os.fork() do?  It calls Posix fork()?  Hopefully from

It seems that os.fork() just calls fork() which is declared in <process.h>

> a single-threaded process?  (A process forked from a multithreaded
> parent can do very little other than call exec.  Unfortunately,
> libjava currently appears to violate that rule by calling
> malloc.  I think that's a bug.)

That could, in fact, explain a lot!
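To make the "fork, then only exec" rule concrete, here is a minimal sketch at the Python level. The helper name run_in_fresh_process is made up for illustration, and the Python interpreter itself still does some work in the child before exec, so this only shows the shape of the pattern, not a guarantee of safety in a multithreaded parent.

```python
import os
import sys

def run_in_fresh_process(argv):
    """Fork, exec immediately in the child, and return the child's exit code.

    Hypothetical helper for illustration only (POSIX systems).
    """
    pid = os.fork()
    if pid == 0:
        # Child: do nothing but exec. If exec fails, leave through
        # os._exit() so that no Python-level cleanup runs in the fork.
        try:
            os.execv(argv[0], argv)
        finally:
            os._exit(127)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

# Run a trivial child program that exits with a known code.
exit_code = run_in_fresh_process([sys.executable, "-c", "raise SystemExit(7)"])
```

The child gets a brand new address space from exec, so whatever locks other threads held at the time of the fork no longer matter.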

The reason people are playing these tricks is that they cannot always control how threads are created in their process, for example when embedding their code into Apache via mod_python or similar facilities. Yet libgcj's GC component will not work with threads it didn't create, which forces users to fork worker processes and hand them the libgcj/Java work over IPC.
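Roughly, the fork-plus-IPC arrangement looks like the sketch below. It assumes a POSIX system and a worker forked before any threads exist. The names start_worker and fake_search are made up, and fake_search merely stands in for the real libgcj/Java work; it is not a PyLucene call.

```python
import os

def start_worker(handle):
    """Fork a worker process; return (pid, request_file, reply_file)."""
    req_r, req_w = os.pipe()   # parent -> worker
    rep_r, rep_w = os.pipe()   # worker -> parent
    pid = os.fork()
    if pid == 0:
        # Worker: only this process would load the libgcj-based code.
        try:
            os.close(req_w)
            os.close(rep_r)
            with os.fdopen(req_r, "r") as requests, \
                 os.fdopen(rep_w, "w") as replies:
                for line in requests:          # one request per line
                    replies.write(handle(line.rstrip("\n")) + "\n")
                    replies.flush()
        finally:
            os._exit(0)                        # never return into the caller
    # Parent: keep only its own pipe ends.
    os.close(req_r)
    os.close(rep_w)
    return pid, os.fdopen(req_w, "w"), os.fdopen(rep_r, "r")

def fake_search(query):
    # Hypothetical stand-in for the Java-side work; not a PyLucene API.
    return "hits for " + query.upper()

pid, to_worker, from_worker = start_worker(fake_search)
to_worker.write("lucene\n")
to_worker.flush()
reply = from_worker.readline().rstrip("\n")
to_worker.close()              # EOF ends the worker's request loop
os.waitpid(pid, 0)
from_worker.close()
```

The server's own threads then only ever touch the two pipe ends, and the garbage-collected code runs in a process whose threads it fully controls.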

For what it's worth, I tried really hard in the past to work around that problem and got it to work on OS X only. On Windows, thread objects would leak; on Linux, it never worked with 2.6 kernels but appeared to work 'well enough' with 2.4 kernels.

Eventually I gave up and turned the problem around: Python doesn't care whether it created a thread or not and runs well either way. So I told users to use PyLucene's PythonThread class, a subclass of Python's threading.Thread that delegates the creation and startup of the actual OS thread to libgcj's java.lang.Thread. There are thus two thread objects, one Python and one Java, for the very same OS thread. This appears to work very well in environments where thread creation can be controlled.
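The shape of that delegation can be sketched in plain Python as follows. This is not PythonThread's actual implementation: ForeignRuntime is a hypothetical stand-in for the runtime that insists on creating every OS thread (the role java.lang.Thread plays in libgcj), and DelegatingThread is a made-up name.

```python
import threading
import _thread

class ForeignRuntime:
    """Hypothetical stand-in for the runtime that must create each OS thread."""

    def __init__(self):
        self.created = []      # names of the threads this runtime started

    def start_thread(self, name, target):
        self.created.append(name)
        _thread.start_new_thread(target, ())

class DelegatingThread(threading.Thread):
    """Python-side thread object whose OS thread is created by the runtime."""

    def __init__(self, runtime, **kwargs):
        super().__init__(**kwargs)
        self._runtime = runtime
        self._done_event = threading.Event()

    def start(self):
        # Deliberately skip threading.Thread.start(): the runtime, not
        # Python, creates and starts the underlying OS thread.
        self._runtime.start_thread(self.name, self._run_in_runtime_thread)

    def _run_in_runtime_thread(self):
        try:
            self.run()         # the usual Thread.run(), which calls target
        finally:
            self._done_event.set()

    def join(self, timeout=None):
        self._done_event.wait(timeout)

runtime = ForeignRuntime()
idents = []
worker = DelegatingThread(
    runtime,
    target=lambda: idents.append(threading.get_ident()),
    name="searcher",
)
worker.start()
worker.join()
```

User code still subclasses or instantiates a familiar Thread-like object, while the runtime gets to see every thread from the moment it is born.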

Andi..