On Mon, 30 Aug 2010, technology inspired wrote:

Thanks for the reply. My example runs fine when it runs alone (pure python).
Here is the code:

Ok, then the next step is to port it to a python http server such as [1] so that you get the threading and initialization story straight:
  - initVM() must be called from the main thread, once
  - any thread created from Python must call attachCurrentThread() before
    making any other calls that involve the JVM
I'm not sure how this is done in the apache2/wsgi environment; that is a question for another forum. That said, if you solve this problem, posting your answer here would be helpful, as this has come up before.
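
As a rough illustration of the two rules above (not a tested mod_wsgi recipe), the per-thread attachment could be sketched as follows. The helper names make_jvm_attacher and make_app are hypothetical. The only PyLucene/JCC calls assumed are initVM(), getVMEnv() and attachCurrentThread() on the env object it returns. The env getter is passed in as an argument, so the threading logic can be tried without a JVM.

```python
import threading


def make_jvm_attacher(get_env):
    """Return a callable that attaches the calling thread to the JVM.

    get_env is any zero-argument callable returning an object with an
    attachCurrentThread() method; with PyLucene this would be
    lucene.getVMEnv (an assumption of this sketch).
    """
    local = threading.local()

    def ensure_attached():
        # Attach each thread only once; later calls are a cheap no-op.
        if not getattr(local, "attached", False):
            get_env().attachCurrentThread()
            local.attached = True

    return ensure_attached


def make_app(ensure_attached, build):
    """Wrap build() in a WSGI callable that attaches the thread first."""

    def application(environ, start_response):
        ensure_attached()  # must happen before any call into the JVM
        build()
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Success"]

    return application


# Hypothetical wiring in the WSGI script, assuming PyLucene is installed
# and that build() no longer calls initVM() itself:
#
#   import lucene
#   lucene.initVM()  # once per process, not once per request
#   application = make_app(make_jvm_attacher(lucene.getVMEnv), build)
```

Whether the WSGI script is loaded from the process's main thread depends on the server, so where initVM() ends up being called still has to be checked in the actual deployment.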

About the errors you're reporting, what you're seeing in your browser is irrelevant. Instead, you must log errors that happen on the Python side and look for these stacktraces there.
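
One possible way to capture Python-side errors, sketched with the standard logging module only: a small WSGI wrapper that logs the traceback of any exception raised while handling a request. The names log_exceptions and "lucene_app" are made up for this example.

```python
import logging
import sys

log = logging.getLogger("lucene_app")  # hypothetical logger name


def log_exceptions(app):
    """WSGI middleware: log any Python exception with its traceback."""

    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            # log.exception() records the message plus the full stack trace.
            log.exception("request failed: %s", environ.get("PATH_INFO"))
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")],
                           sys.exc_info())
            return [b"Internal Server Error"]

    return wrapped
```

This only sees exceptions raised in Python. If the process itself dies (the "Aborted (6)" and "Segmentation fault (11)" exits), nothing is logged from Python and the server's own error log is the place to look.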

Andi..

[1] http://docs.python.org/library/simplehttpserver.html



#import sys, os
#sys.path.append("/home/v/workspace/example-project/src/trunk")
#os.environ['DJANGO_SETTINGS_MODULE'] = 'example.settings'
from lucene import (Field, Document, initVM, NIOFSDirectory, IndexWriter,
                    StandardAnalyzer, Version, File)
from lucene import (SimpleFSLockFactory, NumericField, IndexSearcher,
                    QueryParser, NumericRangeQuery)
from lucene import Integer, BooleanQuery, BooleanClause
#from django.shortcuts import render_to_response
def build():
    initVM()
    dir = NIOFSDirectory(File("/home/v/index"), SimpleFSLockFactory())
    analyzer = StandardAnalyzer(Version.LUCENE_30)
    writer = IndexWriter(dir, analyzer, True,
                         IndexWriter.MaxFieldLength(1024))

    # Currently there is only one row in the database
    field_rows = FieldDoc.objects.all()
    for row in field_rows:
        doc = Document()
        if row.category != "":
            doc.add(Field('category', row.category, Field.Store.YES,
                          Field.Index.NOT_ANALYZED))
            writer.addDocument(doc)

    writer.close()
    #return render_to_response("index.html", {"var": "Success"})

But when I connect it with httpd/mod_wsgi, I sometimes see the "Success"
page and other times "Internal Server Error", with the errors mentioned
in my previous email. I don't know what the best practice is for running
PyLucene code from a web server.

You mentioned using attachCurrentThread(). I tried it this way:
env = initVM()
env.attachCurrentThread()

but there was no change in the response. I don't know if this is how
attachCurrentThread() should be used in the build function above. Please
advise on how to connect Lucene code with Apache2/wsgi. My apache2/wsgi
is configured properly, as I can serve pages that don't use Lucene.
Apache2 is using mpm-worker, a threaded environment.

Thanks.

Regards,
Vin



On Sun, Aug 29, 2010 at 12:21 PM, Andi Vajda <[email protected]> wrote:

      On Sun, 29 Aug 2010, technology inspired wrote:

            I am using PyLucene 3.0.2 on Ubuntu 10.04 with Python 2.6.5
            and Sun Java 1.6. I have written an example script that
            builds an index and stores it in a directory. Later, I want
            to search it from a second example script, which I haven't
            written yet.

            There are two issues I need help with:

            ISSUE 1:
            I am using Apache2 with mod_wsgi 3.3. The index-building
            script is connected to a GET request. When I issue that GET
            request, I get the following errors:

            [error] [client 127.0.0.1] Premature end of script headers: wsgi
            [notice] child pid exit signal Aborted (6).

            With this error, I see "Internal Server Error" in my
            browser. The error appears only if I issue GET requests
            frequently, i.e. around one every 2 seconds. If I issue
            them at 10-second intervals, I don't see these errors.

            ISSUE 2:
            When I index a date field using NumericField, the GET
            request gives "Internal Server Error" on every other
            request, and the Apache2 log file gets these errors:
            [error] [client 127.0.0.1] Premature end of script headers: wsgi
            [notice] child pid exit signal Segmentation fault (11)

            I am looking for help solving these problems. I am running
            mod_wsgi in daemon mode. The WSGI settings are:
            ...
            WSGIDaemonProcess example.com user=www-data group-www-data thread 25
            WSGIProcessGroup example.com
            WSGIScriptAlias / /home/user1/workspace/http_wsgi/wsgi
            ...

            Please advise on how to run PyLucene-based code (searching,
            indexing, etc.) from Apache2 mod_wsgi.


First, get your application to work outside of apache2/wsgi, as a
plain Python program. Then, once it's debugged, adapt it to the
apache2/wsgi environment. And, last but not least, if you are using
threads, be sure to call attachCurrentThread() [1] before calling into
the JVM.

Andi..

[1]
http://lucene.apache.org/pylucene/jcc/documentation/readme.html#api


