I'll just add my 0.1€ cents to Andi's post...

I've been using PyLucene and mod_python for quite some time. On Linux this is a good combination:

- If you, like me, have to switch between hundreds of indexes, opening and closing them for each search operation, then speed may become an issue. (Searching is fast; opening and closing each index is not. I would be very interested in hearing how ReiserFS performs...)
- If, on the other hand, you only search one index, then this setup will perform admirably.
- To prevent memory leaks, which will happen, set your Apache server to reap its children after a given number of requests. 125 requests works for me.
- On Linux you should take care to set os.environ['GCJ_PROPERTIES'] = "disableLuceneLocks=true".
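
A rough sketch of how the last point, and the cost of reopening indexes, might be handled inside a mod_python process. Only the GCJ_PROPERTIES line comes from the advice above; setting it before PyLucene is imported is an assumption on my part. SearcherCache, open_fn and close_fn are hypothetical names, not part of PyLucene: you would pass in whatever functions open and close a searcher in your setup.

```python
import os
from collections import OrderedDict

# Property string as given in the post. Assumption: it should be set
# before PyLucene is imported, so the gcj runtime sees it at start-up.
os.environ['GCJ_PROPERTIES'] = 'disableLuceneLocks=true'


class SearcherCache:
    """Hypothetical helper: keep a bounded number of searchers open.

    open_fn(path) must return an open searcher for the index at path;
    close_fn(searcher) must close it. Both are supplied by the caller.
    """

    def __init__(self, open_fn, close_fn, max_open=32):
        if max_open < 1:
            raise ValueError("max_open must be at least 1")
        self._open_fn = open_fn
        self._close_fn = close_fn
        self._max_open = max_open
        self._open = OrderedDict()  # path -> searcher, oldest first

    def get(self, path):
        # Reuse an already open searcher and mark it most recently used.
        if path in self._open:
            self._open.move_to_end(path)
            return self._open[path]
        # Otherwise open it, then evict the least recently used if over limit.
        searcher = self._open_fn(path)
        self._open[path] = searcher
        if len(self._open) > self._max_open:
            _, oldest = self._open.popitem(last=False)
            self._close_fn(oldest)
        return searcher
```

Whether this helps depends on how often the same index is hit twice within the lifetime of one Apache child; with children reaped after 125 requests the cache starts empty fairly often.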

regards
/rune

On 21. aug. 2005, at 01.14, Andi Vajda wrote:



1) Would pyLucene + Apache2/mod_python running on Debian Stable on a P4 server with 2 GB RAM be a good direction to go in from a performance and stability standpoint?


Some people have reported memory leaks on some Linuxes (only). I've been unable to reproduce or track them down. I suspect some gcj issue that is specific to a given gcj and linux version pair as no such problems have been reported on Mac OS X or Windows.

There are also threading issues when using PyLucene with mod_python. These have been discussed on this list before; please refer to the archives for more info.


We're talking on the order of 50-100 searches per user per day, with the concentration being during the eight hour workday window. To be liberal, let's say we're looking at about 4000 searches during the workday and maybe 500 or so outside that time window. Will pyLucene be sufficient?


My guess is yes. Some people on this list have developed applications similar to yours. May they chime in!
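
For scale, the figures quoted above can be turned into an average rate. This is plain arithmetic on the numbers in the question, assuming searches are spread evenly over each window, which real traffic will not be:

```python
# Figures from the question: 4000 searches in an 8-hour workday,
# roughly 500 in the remaining 16 hours.
WORKDAY_SEARCHES = 4000
WORKDAY_SECONDS = 8 * 3600
OFFHOURS_SEARCHES = 500
OFFHOURS_SECONDS = 16 * 3600

# Average rates, assuming an even spread (a simplification).
workday_rate = WORKDAY_SEARCHES / WORKDAY_SECONDS
offhours_rate = OFFHOURS_SEARCHES / OFFHOURS_SECONDS
seconds_between_searches = WORKDAY_SECONDS / WORKDAY_SEARCHES

print("workday: %.3f searches/s" % workday_rate)
print("off-hours: %.4f searches/s" % offhours_rate)
print("one search every %.1f s during the workday" % seconds_between_searches)
```

Peaks will be higher than the average, so some headroom above these numbers is worth allowing for.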


2) Using pyLucene on Linux, are there filesystem considerations? I've become a huge fan of ReiserFS on my mail servers, where I'm using Maildirs -- the performance is noticeably better than Ext3. I'm guessing ReiserFS would have similarly happy results for pyLucene (or regular old Lucene). Anyone know about this?


The files created by PyLucene are the same as with Java Lucene. As a matter of fact, they should even be compatible with each other (using the same versions of Lucene, 1.4.3 at the moment). ReiserFS is good at having many small files in a given directory, not typically the case with a Lucene index. For more information, you should ask the [email protected] mailing list where general Lucene usage issues are much better covered than here.


3) I've been looking through the archives -- I'll make sure to compile
pyLucene with the latest gcj!


If you see a pattern with the leaks reported on Linux, let us know.

Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev


Happy those, who can remain at Highbury!
Jane Austen (1775-1817)

