On Thu, Dec 18, 2003 at 11:40:32AM -0800, Doug Cutting wrote:
> Dror Matalon wrote:
> >There are two issues:
> >1. Having new searches start using the new index only when it's ready,
> >not in a "half baked" state, which means that you have to synchronize
> >the switch from the old index to the new one.
>
> That's true. If you're doing updates (as opposed to just additions)
> then you probably want to do something like:
At this point, we're just doing additions.

> 1. keep a single open IndexReader used by all searches
> 2. Every few minutes, process updates as follows:
>    a. open a second IndexReader
>    b. delete all documents that will be updated
>    c. close this IndexReader, to flush deletions
>    d. open an IndexWriter
>    e. add all documents that are updated
>    f. close the IndexReader
>    g. replace the IndexReader used for searches (1, above)
>
> >2. It's not trivial to figure out when it's safe to discard the old
> >index; all existing searches are done with it.
> >
> >To make things more complicated, the Hits object is dependent on your
> >IndexSearcher object, so if you have Hits objects in use you probably
> >can't close your IndexSearcher.
> >
> >Is this a correct analysis or is there an obvious strategy to work
> >around this issue?
>
> Right, you cannot safely close the IndexReader that's being used for
> searching. Rather, just drop it on the floor and let it get garbage
> collected. Its files will be closed when this happens. Provided you're
> not updating more frequently than the garbage collector runs, you should
> only ever have two IndexReaders open and shouldn't run into file handle
> issues.

Actually, rather than relying on the GC, my plan is to always have two
indexes open, the current one and the previous one. When I detect that
it's time to switch, I close the oldest index. Luckily, in a web
environment we know that no search is going to be around for several
minutes, and I make sure that my indexes don't change more often than
every 15 minutes.

Thanks for the prompt and detailed feedback.
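The two-index rotation described above could be sketched roughly as follows. This is only an illustration in plain Java: TwoGenerationHolder, current() and switchTo() are made-up names, and java.io.Closeable stands in for whatever is actually held (an IndexSearcher or IndexReader); none of it is Lucene API. It also assumes that switches are spaced far apart compared with how long a search and its Hits stay in use.

```java
import java.io.Closeable;
import java.io.IOException;

/**
 * Keeps the current and the previous generation of a resource open.
 * Hypothetical helper; T stands in for an IndexSearcher or IndexReader.
 */
class TwoGenerationHolder<T extends Closeable> {
    private T current;   // handed to all new searches
    private T previous;  // may still back Hits from searches in flight

    TwoGenerationHolder(T initial) {
        this.current = initial;
    }

    /** Returns the generation that new searches should use. */
    synchronized T current() {
        return current;
    }

    /**
     * Publishes a fully built replacement. The old current generation
     * becomes the previous one, and the generation before that is closed.
     * Assumption: nothing still uses the oldest generation by the time
     * the next switch happens (e.g. switches at most every 15 minutes).
     */
    void switchTo(T next) throws IOException {
        T oldest;
        synchronized (this) {
            oldest = previous;
            previous = current;
            current = next;
        }
        if (oldest != null) {
            oldest.close(); // closed outside the lock so searches are not blocked
        }
    }
}
```

Each search would call current() once and keep using that object for its Hits; only the indexing side calls switchTo(), and only after the new index is completely built.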

Dror

>
> Doug
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>

-- 
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com
