1) Would pyLucene + Apache2/mod_python running on Debian Stable on a P4 server with 2 GB of RAM be a good direction to go in from a performance and stability standpoint?
Some people have reported memory leaks, but only on some Linux systems. I've been unable to reproduce or track them down. I suspect a gcj issue specific to a particular pairing of gcj and Linux versions, as no such problems have been reported on Mac OS X or Windows.
There are also threading-related issues with using PyLucene under mod_python. These have been discussed on this list before; please refer to the archives for more info.
We're talking on the order of 50-100 searches per user per day, concentrated in the eight-hour workday window. To be liberal, let's say we're looking at about 4000 searches during the workday and maybe 500 or so outside that window. Will pyLucene be sufficient?
My guess is yes. Some people on this list have developed applications similar to yours. May they chime in!
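To put the quoted figures in perspective, here is a rough back-of-the-envelope calculation. It only uses the 4000 searches and eight-hour window from the question; the peak factor of 10 is purely an assumption for illustration, not a measurement, and the function name is made up for this sketch.

```python
# Back-of-the-envelope load estimate using the figures quoted in the question.
WORKDAY_SEARCHES = 4000   # searches during the workday (from the question)
WORKDAY_HOURS = 8         # length of the workday window (from the question)
PEAK_FACTOR = 10          # ASSUMPTION: bursts of up to 10x the average rate


def average_rate_per_second(searches, hours):
    """Return the mean number of searches per second over the window."""
    return searches / (hours * 3600.0)


avg = average_rate_per_second(WORKDAY_SEARCHES, WORKDAY_HOURS)
peak = avg * PEAK_FACTOR

print("average: %.3f searches/s" % avg)                   # average: 0.139 searches/s
print("assumed peak: %.2f searches/s" % peak)             # assumed peak: 1.39 searches/s
print("mean gap between searches: %.1f s" % (1.0 / avg))  # mean gap between searches: 7.2 s
```

On average that is one search roughly every seven seconds, so the load itself is modest; whether a given setup keeps up still depends on index size and query complexity, which this estimate does not cover.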
2) Using pyLucene on Linux, are there filesystem considerations? I've become a huge fan of ReiserFS on my mail servers, where I'm using Maildirs -- the performance is noticeably better than Ext3. I'm guessing ReiserFS would have similarly happy results for pyLucene (or regular old Lucene). Anyone know about this?
The files created by PyLucene are the same as with Java Lucene. As a matter of fact, they should even be compatible with each other, provided both use the same version of Lucene (1.4.3 at the moment). ReiserFS is good at handling many small files in a given directory, which is not typically the situation with a Lucene index. For more information, you should ask the [email protected] mailing list where general Lucene usage issues are much better covered than here.
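If you want to check how your own index compares with a Maildir in this respect, a minimal sketch along these lines counts the files in a directory and reports their sizes. It uses only the Python standard library, not PyLucene; the function name summarize_index_dir is hypothetical, and the path you pass is whatever directory your index lives in.

```python
import os


def summarize_index_dir(path):
    """Return (file_count, total_bytes, mean_bytes) for regular files in path.

    Hypothetical helper: it only inspects the directory listing and
    knows nothing about the Lucene file format.
    """
    sizes = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isfile(full):  # ignore subdirectories
            sizes.append(os.path.getsize(full))
    total = sum(sizes)
    mean = total / float(len(sizes)) if sizes else 0.0
    return len(sizes), total, mean


# Example: summarize the current directory; substitute your index directory.
count, total, mean = summarize_index_dir(".")
print("%d files, %d bytes total, %.0f bytes on average" % (count, total, mean))
```

A directory with a handful of large files, rather than thousands of small ones, would be consistent with the point above that the Maildir case does not carry over directly.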
3) I've been looking through the archives -- I'll make sure to compile pyLucene with the latest gcj!
If you see a pattern with the leaks reported on Linux, let us know.

Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
