Hi,

Firstly: PyLucene is great :)

I have a search server that processes search requests. It simply
accepts a search string, performs the query and returns the hits using
Twisted's Perspective Broker - nothing special.

The indexer is running 24/7, updating the index as content is added to
the system. From trial and error, I have found that for my search
server to remain up-to-date it has to reload the index regularly (say
every 5 minutes), otherwise new documents will not be found.

All of the search results are sorted by a date field, and I
discovered that simply reloading the directory was not enough: the
first query after each reload experiences a very large lag (about
60 secs with 1 million docs). (I presume there's some loading into
memory, but does anyone have further details on this process?)

My solution is to perform a "fake" search every time the search
directory is reloaded, but this causes the search server to be
unresponsive to all requests until the "fake" search is completed.

Is there a better way to do this or a work around for this lag on the
first search query?

PS: To complicate things further I'm using Twisted, so I'm unsure how
threads could help...
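
To make the question concrete, here is a minimal sketch of the
"warm, then swap" idea I have in mind: open the reloaded searcher off
to the side, run the slow "fake" sorted search against it, and only
then make it the live one, so the old searcher keeps answering in the
meantime. It uses only the standard library. SearcherHolder,
open_searcher and warm are hypothetical names for illustration, not
PyLucene API, and I have not verified that PyLucene allows index
access from a thread started this way.

```python
import threading


class SearcherHolder:
    """Keeps the live searcher and swaps in a pre-warmed replacement."""

    def __init__(self, open_searcher, warm):
        # open_searcher: hypothetical callable returning a fresh searcher.
        # warm: hypothetical callable that runs the slow "fake" sorted query.
        self._open_searcher = open_searcher
        self._warm = warm
        self._lock = threading.Lock()
        self._current = open_searcher()
        warm(self._current)

    def get(self):
        """Return the searcher that requests should use right now."""
        with self._lock:
            return self._current

    def reload(self):
        """Open and warm a new searcher, then swap it in; return the old one."""
        fresh = self._open_searcher()
        self._warm(fresh)  # slow step; the old searcher is still being served
        with self._lock:
            old, self._current = self._current, fresh
        return old  # caller decides when it is safe to close this

    def reload_in_background(self):
        """Run reload() on a separate thread and return that thread."""
        worker = threading.Thread(target=self.reload)
        worker.daemon = True
        worker.start()
        return worker
```

Under Twisted, reload() could presumably be handed to
twisted.internet.threads.deferToThread instead of a hand-made thread,
and scheduled every 5 minutes with twisted.internet.task.LoopingCall,
so the reactor stays responsive while the new searcher warms up.
Whether that is safe with PyLucene is exactly what I am unsure about.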

Cheers,
Chris


--

Web2.0 Developer
http://www.chriswere.com/
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
