Hi, Firstly: PyLucene is great :)
I have a search server that processes search requests. It simply accepts a search string, performs the query and returns the hits using Perspective Broker - nothing special. The indexer runs 24/7, updating the index as content is added to the system.

From my trial-and-error investigation, for the search server to stay up to date it has to reload the index regularly (say every 5 minutes), otherwise new documents will not be found. All of the search results are sorted by a date field, and I discovered that simply reloading the directory was not enough: the first query coming into the system after a reload experiences a very large lag (about 60 seconds with 1 million documents). (I presume there's some loading into memory, but does anyone have further details on this process?)

My solution is to perform a "fake" search every time the search directory is reloaded, but this makes the search server unresponsive to all requests until the "fake" search has completed.

Is there a better way to do this, or a workaround for this lag on the first search query?

PS: To complicate things further I'm using Twisted, so I'm unsure how threads could help...

Cheers,
Chris

--
Web2.0 Developer
http://www.chriswere.com/
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
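
To make the reload-then-"fake"-search cycle concrete, here is a minimal sketch of one possible arrangement: the new searcher is opened and warmed in a background thread, and only swapped in once the warm-up has finished, so the old searcher keeps answering requests in the meantime. This is an illustration under assumptions, not a tested PyLucene recipe: `SearcherHolder`, `open_searcher` and `warm` are hypothetical names, with `open_searcher` standing in for whatever reopens the index and `warm` for the date-sorted "fake" query.

```python
import threading


class SearcherHolder:
    """Holds the live searcher and swaps in a warmed replacement."""

    def __init__(self, open_searcher, warm):
        # open_searcher() and warm(searcher) are hypothetical callables
        # supplied by the application; they are not PyLucene APIs.
        self._open_searcher = open_searcher
        self._warm = warm
        self._lock = threading.Lock()
        self._current = open_searcher()
        warm(self._current)

    def get(self):
        """Return the searcher that requests should use right now."""
        with self._lock:
            return self._current

    def refresh(self):
        """Open and warm a new searcher, then swap it in.

        The slow warm-up runs on the new searcher only, so callers of
        get() keep receiving the old one until the swap.
        """
        fresh = self._open_searcher()
        self._warm(fresh)
        with self._lock:
            old, self._current = self._current, fresh
        # The caller decides when it is safe to close the old searcher.
        return old

    def refresh_in_background(self):
        """Run refresh() in a daemon thread and return the thread."""
        worker = threading.Thread(target=self.refresh)
        worker.daemon = True
        worker.start()
        return worker
```

Under Twisted, `refresh` could presumably be passed to `twisted.internet.threads.deferToThread` on a timer instead of using a raw thread, so the reactor stays responsive. Whether PyLucene permits index access from such a thread depends on its own threading requirements, which this sketch does not address.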
