Andi Vajda wrote:
>
> On Fri, 14 Jul 2006, David Fraser wrote:
>
>> Have been trying to research PyLucene under mod_python and the current
>> threading problem. Found the following bug report:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13212 - JNI/CNI
>> AttachCurrentThread does not register thread with garbage collector
>> which I presume is the underlying issue.
>> There's a patch in that issue which apparently adds the necessary
>> changes to be able to attach the GC to a native thread after creation.
>> Would this be the right approach to enabling PyLucene to be able to work
>> in standard Python threads?
>> I presume we'd have to remember whether the current thread has been
>> attached or not, and attach it if necessary.
>
> Until this is supported by the GC module in libgcj (an old bug, but
> with an intent of making it work in the future, quoting Hans Boehm, its
> author), the usual trick is to use PyLucene.PythonThread, which is a
> subclass of Python's thread class that delegates the starting of the
> underlying OS thread to libgcj.
Right, that's what I was trying to ask above - is the bug I quoted the
correct bug? If so, there seems to be a patch on it that, if included,
would remove the need to delegate the starting of the OS thread to gcj.
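
To make the "remember whether the current thread has been attached" idea
concrete, here is a minimal sketch using thread-local state. It assumes
the patched GC exposes some way to register a native thread; attach_to_gc()
below is a hypothetical stand-in for that call, not an existing PyLucene or
libgcj API, and it only records the thread name.

```python
import threading

# Per-thread storage: each thread sees its own "attached" flag.
_state = threading.local()

# Names of threads that were attached, kept only for illustration.
attach_calls = []

def attach_to_gc():
    """Hypothetical stand-in for registering the current thread with the GC."""
    attach_calls.append(threading.current_thread().name)

def ensure_attached():
    """Attach the calling thread once; later calls in that thread do nothing."""
    if not getattr(_state, "attached", False):
        attach_to_gc()
        _state.attached = True

def worker():
    # A request handler would call ensure_attached() before touching PyLucene.
    for _ in range(3):
        ensure_attached()
```

Each thread pays the attach cost once, on its first call, which would suit
threads created by Apache rather than by Python.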
> There are usually ways to customize the web service setup so that it
> runs in instances of PyLucene.PythonThread. What they are for
> mod_python, I don't know, but this is definitely the number one FAQ on
> this list :)
And so my above message is an answer to the FAQ - you can't customize
mod_python to create threads differently because the threads are
actually created by the Apache server without reference to mod_python
(in fact they are created by the particular Multi-Processing Module used
on that platform, but the point is it would require modifying Apache
itself).
I did quite a bit of research and found the following alternatives:
1) Put the Python indexing code in another process, and talk to that
(which most people are currently doing). An alternative here is to use
http://incubator.apache.org/solr/ (or an equivalent) to handle the
indexing (although it's not Python)
2) Solve the GC issue as discussed above
3) mod_gcj (http://mod-gcj.sourceforge.net/about.html) tries to host gcj
code in Apache in a separate process. We could find some way of using
this code to host PyLucene
4) Hook the Apache thread creation process - some Fedora people are
apparently doing this by using an LD_PRELOAD environment variable, but it
can have unexpected consequences. See
https://www.redhat.com/archives/fedora-devel-java-list/2006-January/msg00002.html
(and the discussion in the above bug)
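
Alternative 1 could be sketched roughly as follows. This is only an
illustration of the process boundary, not a PyLucene implementation: the
worker is a stand-in that keeps an in-memory dict where real code would
import PyLucene, and the JSON-lines protocol, IndexClient and the "add" /
"search" operations are all made up for the example.

```python
import json
import subprocess
import sys
import threading

# Source of the child process. A real worker would import PyLucene here and
# index on its own thread; this stand-in uses a plain dict (assumption).
WORKER_SOURCE = r'''
import json, sys
index = {}
for line in sys.stdin:
    req = json.loads(line)
    if req["op"] == "add":
        for word in req["text"].lower().split():
            index.setdefault(word, []).append(req["id"])
        reply = {"ok": True}
    elif req["op"] == "search":
        reply = {"ok": True, "ids": index.get(req["term"].lower(), [])}
    else:
        reply = {"ok": False}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''

class IndexClient:
    """Talks to the indexing process over pipes, one JSON object per line."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
        # Web server threads may call request() concurrently, so serialize.
        self.lock = threading.Lock()

    def request(self, **req):
        with self.lock:
            self.proc.stdin.write(json.dumps(req) + "\n")
            self.proc.stdin.flush()
            return json.loads(self.proc.stdout.readline())

    def close(self):
        # Closing stdin ends the worker's read loop, so it exits cleanly.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

The web-facing threads only ever touch the pipes, so it no longer matters
who created them; the cost is an extra process and a serialization step
per request.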

Maybe I should write up the above (with more explanations) for the
PyLucene web page / the OSAF wiki?

David
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
