Andi Vajda wrote:
>
> On Fri, 14 Jul 2006, David Fraser wrote:
>
>> Have been trying to research PyLucene under mod_python and the current
>> threading problem. Found the following bug report:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13212 - JNI/CNI
>> AttachCurrentThread does not register thread with garbage collector
>> which I presume is the underlying issue.
>> There's a patch in that issue which apparently adds the necessary
>> changes to be able to attach the GC to a native thread after creation.
>> Would this be the right approach to enabling PyLucene to be able to work
>> in standard Python threads?
>> I presume we'd have to remember whether the current thread has been
>> attached or not, and attach it if necessary
>
> Until this is supported by the GC module in libgcj (an old bug, but
> with an intent of making it work in the future, quoting Hans Boehm, its
> author), the usual trick is to use PyLucene.PythonThread which is a
> subclass of python's thread class which delegates the starting of the
> underlying OS thread to libgcj.

Right, that's what I was trying to ask above: is the bug I quoted the
correct bug? If so, there seems to be a patch on it that, if included,
would remove the need to delegate the starting of the OS thread to gcj.

> There are usually ways to customize the web service setup so that it
> runs in instances of PyLucene.PythonThread. What they are for
> mod_python, I don't know, but this is definitely the number one FAQ on
> this list :)

And so my message above is an answer to the FAQ: you can't customize
mod_python to create threads differently, because the threads are
created by the Apache server without reference to mod_python. (In fact
they are created by the particular Multi-Processing Module used on that
platform, but the point is that changing this would require modifying
Apache itself.)
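
For reference, here is a minimal sketch of the PythonThread approach
Andi describes, for the case where your own code creates the threads
(which is exactly what mod_python does not allow). It assumes
PythonThread takes the same constructor arguments as threading.Thread,
since it is described as a subclass. index_documents and the sample
input are made up for illustration, and the fallback import exists only
so the sketch runs where PyLucene is not installed.

```python
import threading

try:
    # GCJ-based PyLucene: starting the OS thread is delegated to libgcj.
    from PyLucene import PythonThread
except ImportError:
    # Fallback only so this sketch runs without PyLucene installed.
    PythonThread = threading.Thread

results = []


def index_documents(docs):
    # Hypothetical worker; real code would call into PyLucene here.
    for doc in docs:
        results.append(doc.upper())


# Assumption: same constructor signature as threading.Thread.
worker = PythonThread(target=index_documents, args=(["a", "b"],))
worker.start()
worker.join()
```

The key point is that the thread has to be created through PythonThread
in the first place, which is not possible for threads Apache has
already started.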

I researched this quite a lot and found the following alternatives:

1) Put the Python indexing code in another process, and talk to that
   (which is what most people are currently doing). An alternative here
   is to use http://incubator.apache.org/solr/ (or an equivalent) to
   handle the indexing (although it's not Python).
2) Solve the GC issue as discussed above.
3) mod_gcj (http://mod-gcj.sourceforge.net/about.html) tries to host
   gcj code in Apache in a separate process. Find some way of using
   this code to host PyLucene.
4) Hook the Apache thread creation process. Some Fedora people are
   apparently doing this by using the LD_PRELOAD environment variable,
   but it can have unexpected consequences. See
   https://www.redhat.com/archives/fedora-devel-java-list/2006-January/msg00002.html
   (and the discussion in the above bug).
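
Alternative 1 could be sketched roughly as follows. This is only an
illustration under stated assumptions: the worker is a child process
speaking one JSON message per line over stdin/stdout, a plain dict
stands in for the real PyLucene index, and the names WORKER_SOURCE,
IndexClient and the "add"/"search" operations are invented for the
example. The web-facing code never touches PyLucene, so it does not
matter which threads Apache runs it in.

```python
import json
import subprocess
import sys

# Source of the worker process. In a real setup this process would own
# PyLucene; here a dict of word -> set of doc ids stands in for the index.
WORKER_SOURCE = r'''
import json, sys
index = {}
for line in iter(sys.stdin.readline, ""):
    request = json.loads(line)
    if request["op"] == "add":
        for word in request["text"].lower().split():
            index.setdefault(word, set()).add(request["doc_id"])
        reply = {"ok": True}
    elif request["op"] == "search":
        hits = sorted(index.get(request["term"].lower(), ()))
        reply = {"ok": True, "hits": hits}
    else:
        reply = {"ok": False, "error": "unknown op"}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class IndexClient:
    """Hypothetical client used from the web handler side."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def call(self, **request):
        # One request line out, one reply line back.
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self):
        # Closing stdin ends the worker's read loop.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


client = IndexClient()
added = client.call(op="add", doc_id=1, text="hello world")
found = client.call(op="search", term="hello")
missing = client.call(op="search", term="absent")
client.close()
```

A real deployment would more likely use a long-running daemon reached
over a socket than a child process per client, and would need locking
if several handler threads share one client.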

Maybe I should write up the above (with more explanations) for the
PyLucene web page / the OSAF wiki?

David
_______________________________________________
pylucene-dev mailing list
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
