On Fri, 2 Sep 2005, Boehm, Hans wrote:
> Unfortunately, I'm neither a PyLucene nor a Python expert.

Simply put, PyLucene is a Python wrapper around a gcj-based library built from
Java Lucene.

I suspect that people embedding PURE gcj-compiled Java code into a non-Java
process would face the very same threading problems.

> What does os.fork() do? It calls Posix fork()?

It seems that os.fork() just calls fork(), which is declared in <process.h>.

> Hopefully from a single-threaded process? (A process forked from a
> multithreaded parent can do very little other than call exec.
> Unfortunately, libjava currently appears to violate that rule by calling
> malloc. I think that's a bug.)

That could, in fact, explain a lot!
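
To illustrate the rule quoted above, here is a minimal Python sketch of a child that does nothing after fork() except call exec. The helper name run_in_child is hypothetical and not part of PyLucene or libgcj. The Python interpreter itself still does some bookkeeping in the child before exec, so this only approximates the strict rule at the Python level.

```python
import os
import sys


def run_in_child(argv):
    """Fork, exec argv in the child right away, and return its exit status.

    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: the only thing done here is exec. If exec fails, leave
        # with os._exit() so that no Python-level cleanup (atexit handlers,
        # buffered output flushing) runs in the forked copy.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)
    # Parent: wait for the child and translate its wait status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = run_in_child([sys.executable, "-c", "print('hello from the child')"])
    print("child exited with", code)
```

The value 127 for a failed exec follows the usual shell convention and is a choice made for this sketch.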

The reason people are playing these tricks is that they cannot always control
how threads are created in their process, for example when embedding their
code into Apache via mod_python or other such facilities. Yet libgcj's GC
component will not work with threads it didn't create, forcing users to resort
to forking processes and using IPC to hand the libgcj/Java work to them.
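
The fork-plus-IPC workaround described above can be sketched roughly as follows. This assumes the worker is forked while the parent is still single-threaded, uses two pipes with length-prefixed pickled messages, and replaces the real libgcj/Java call with a hypothetical stand-in named do_native_work. The class and helper names are made up for this example.

```python
import os
import pickle
import struct


def do_native_work(request):
    # Hypothetical stand-in for the libgcj/Java call made in the worker.
    return request.upper()


def _write_all(fd, data):
    # os.write may write fewer bytes than requested, so loop until done.
    while data:
        data = data[os.write(fd, data):]


def _read_exact(fd, size):
    buf = b""
    while len(buf) < size:
        chunk = os.read(fd, size - len(buf))
        if not chunk:
            raise EOFError("peer closed the pipe")
        buf += chunk
    return buf


def _send(fd, obj):
    payload = pickle.dumps(obj)
    _write_all(fd, struct.pack("!I", len(payload)) + payload)


def _recv(fd):
    (size,) = struct.unpack("!I", _read_exact(fd, 4))
    return pickle.loads(_read_exact(fd, size))


class ForkedWorker:
    """Worker process reached over pipes; fork it before starting threads."""

    def __init__(self):
        req_r, req_w = os.pipe()
        res_r, res_w = os.pipe()
        self.pid = os.fork()
        if self.pid == 0:
            # Child: keep only its ends of the pipes and serve requests
            # until the parent closes the request pipe.
            os.close(req_w)
            os.close(res_r)
            try:
                while True:
                    try:
                        request = _recv(req_r)
                    except EOFError:
                        break
                    _send(res_w, do_native_work(request))
            finally:
                os._exit(0)
        # Parent: keep the opposite ends.
        os.close(req_r)
        os.close(res_w)
        self._req, self._res = req_w, res_r

    def submit(self, request):
        """Send one request and block until the worker answers."""
        _send(self._req, request)
        return _recv(self._res)

    def close(self):
        os.close(self._req)
        os.close(self._res)
        os.waitpid(self.pid, 0)
```

A caller would create one ForkedWorker at startup, call submit() for each piece of work, and call close() on shutdown. Concurrent callers would need a lock around submit(), which this sketch leaves out.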

For what it's worth, in the past I tried really hard to work around that
problem and got it to work on OS X only. On Windows, thread objects would
leak, and on Linux it never worked with 2.6 kernels but appeared to work 'well
enough' with 2.4 kernels.

Eventually I gave up and turned the problem around: Python doesn't care and
runs well whether or not it created the threads. I told users to use
PyLucene's PythonThread class, a subclass of Python's threading.Thread that
delegates the creation and startup of the actual OS thread to libgcj's
java.lang.Thread. The result is two thread objects, one Python and one Java,
for the very same OS thread. This appears to work very well in environments
where thread creation can be controlled.
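
From the user's side, switching to PythonThread might look roughly like the sketch below. It assumes PythonThread can be imported from the PyLucene module and offers the same interface as threading.Thread. It falls back to threading.Thread so that it runs without PyLucene installed. SearchWorker and its run() body are hypothetical.

```python
import threading

try:
    # Assumption: the gcj-based PyLucene exposes PythonThread at module level.
    from PyLucene import PythonThread as BaseThread
except ImportError:
    # Fallback so this sketch still runs where PyLucene is not installed.
    BaseThread = threading.Thread


class SearchWorker(BaseThread):
    """Hypothetical worker; only the base class differs from a plain thread."""

    def __init__(self, terms):
        BaseThread.__init__(self)
        self.terms = terms
        self.results = []

    def run(self):
        # Calls into PyLucene would go here. A trivial stand-in is used
        # instead so the example does not depend on the library.
        self.results = [term.lower() for term in self.terms]


if __name__ == "__main__":
    worker = SearchWorker(["Foo", "BAR"])
    worker.start()
    worker.join()
    print(worker.results)
```

The point of the pattern is that application code keeps using start(), run() and join() as usual, while the base class decides who creates the underlying OS thread.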
Andi..