I'll just add my 0.1€ cents to Andi's post...
I've been using PyLucene and mod_python for quite some time. On Linux
this is a good combination:
- If you, like me, have to switch between hundreds of indexes, opening
and closing them for each search operation, then speed may become an
issue.
(Searching is fast; opening and closing each index is not. I would
be very interested in hearing how ReiserFS performs...)
- If, on the other hand, you will only search one index, then this
setup will perform admirably.
- To prevent memory leaks, which will happen, set your Apache server
to reap its children after a given number of requests. 125 requests
works for me.
- On Linux you should take care to set
os.environ['GCJ_PROPERTIES'] = "disableLuceneLocks=true"
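A rough sketch of how the points above might fit together in a handler module. The environment variable is the one mentioned above; setting it before PyLucene is imported is my assumption about when the gcj runtime reads it. The SearcherCache class and its open_index parameter are hypothetical names, not part of PyLucene, and keeping a bounded set of indexes open is only one possible way around the open/close cost, not something my setup does:

```python
import os
from collections import OrderedDict

# Assumption: the gcj runtime reads this when it starts, so set it
# before PyLucene is imported anywhere in the process.
os.environ['GCJ_PROPERTIES'] = 'disableLuceneLocks=true'


class SearcherCache(object):
    """Keep at most `capacity` index handles open (hypothetical helper).

    `open_index` is any callable taking an index path and returning an
    object with a close() method.
    """

    def __init__(self, open_index, capacity=16):
        self._open_index = open_index
        self._capacity = capacity
        self._handles = OrderedDict()  # path -> handle, oldest first

    def get(self, path):
        handle = self._handles.pop(path, None)
        if handle is None:
            handle = self._open_index(path)  # the slow part
        self._handles[path] = handle  # most recently used goes last
        while len(self._handles) > self._capacity:
            _, oldest = self._handles.popitem(last=False)
            oldest.close()
        return handle

    def close_all(self):
        while self._handles:
            _, handle = self._handles.popitem()
            handle.close()
```

In practice open_index would be whatever opens a searcher in your PyLucene version. Two caveats: a cached handle will not see later changes to its index, and the class is not thread-safe, so it needs a lock under a threaded Apache. Since the Apache children are reaped anyway (MaxRequestsPerChild is the usual directive for that), each cache only lives as long as its child process.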
regards
/rune
On 21. aug. 2005, at 01.14, Andi Vajda wrote:
1) Would pyLucene + Apache2/mod_python running on Debian Stable on
a P4 server with 2 GB RAM be a good direction to go in from a
performance and stability standpoint?
Some people have reported memory leaks on some Linuxes (only). I've
been unable to reproduce or track them down. I suspect some gcj
issue that is specific to a given gcj and linux version pair as no
such problems have been reported on Mac OS X or Windows.
There are also issues with using PyLucene with mod_python having to
do with threading. These have been discussed on this list before,
please refer to the archives for more info.
We're talking on the order of 50-100 searches per user per day, with
the concentration being during the eight-hour workday window. To be
liberal, let's say we're looking at about 4000 searches during the
workday and maybe 500 or so outside that time window. Will pyLucene
be sufficient?
My guess is yes. Some people on this list have developed similar
applications to yours. May they chime in!
2) Using pyLucene on Linux, are there filesystem considerations?
I've become a huge fan of ReiserFS on my mail servers, where I'm
using Maildirs -- the performance is noticeably better than Ext3.
I'm guessing ReiserFS would have similarly happy results for
pyLucene (or regular old Lucene). Anyone know about this?
The files created by PyLucene are the same as with Java Lucene. As
a matter of fact, they should even be compatible with each other
(using the same versions of Lucene, 1.4.3 at the moment). ReiserFS
is good at having many small files in a given directory, not
typically the case with a Lucene index. For more information, you
should ask the [email protected] mailing list where
general Lucene usage issues are much better covered than here.
3) I've been looking through the archives -- I'll make sure to
compile pyLucene with the latest gcj!
If you see a pattern with the leaks reported on Linux, let us know.
Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
Happy those, who can remain at Highbury!
Jane Austen (1775-1817)