Re: Swapping Indexes?
Forwarding back to the list.

-- Forwarded message --
From: Patrick Burleson [EMAIL PROTECTED]
Date: Tue, 17 Aug 2004 11:30:19 -0400
Subject: Re: Swapping Indexes?
To: Stephane James Vaucher [EMAIL PROTECTED]

Stephane,

Thank you for the ideas. I'm implementing idea 1 (I like the idea of leaving the temp index around for recovery), but I have a question regarding your original index. Do you just copy the temp index over it and not worry about cleaning up the old index directory?

Right now my code deletes the files in the main index directory after telling the search controller to switch to the temp index. But doing that means I have to manage existing searches so they don't break while they are running. I also still run into the open-files problem on Windows when I try to delete a file that one of the searchers still has open.

Thoughts?

Patrick

On Mon, 16 Aug 2004 18:22:20 -0400 (EDT), Stephane James Vaucher [EMAIL PROTECTED] wrote:
> I've tried two options that seem to work:
>
> 1) Have a singleton that is responsible for controlling your searchers. This controller can temporarily redirect your searchers to c:/temp/myindex, allowing you to copy your index to c:/myindex. After that process completes, your controller can tell your searchers to use c:/myindex, allowing you to then erase your temp index. If you index nightly, you can simply *not* erase your tmp dir; your indexing process will do this automatically if you create your IndexWriter with the overwrite option. This way, you have a backup index if there is a system failure at some point (like when you copy/move directories).
>
> 2) Use an incremental index. Regularly, I scan my files, see if there are modifications/additions and update my master index. That means removing from the master index, adding to a temp dir, then merging. I haven't seen any weirdness on Windows with this process.
>
> HTH,
> sv
>
> On Mon, 16 Aug 2004, Patrick Burleson wrote:
> > I've read in the docs about updating an index and their suggestion regarding swapping out indexes with a directory rename. Here's my question: how do I do this when searches are running live?
> >
> > Say I have a directory that holds the current valid index:
> >
> > C:\myindex
> >
> > and when I'm running my nightly process to generate the index, it gets temporarily indexed to:
> >
> > C:\temp\myindex
> >
> > How can I very quickly replace C:\myindex with C:\temp\myindex? I can't simply do a rename, since C:\myindex will likely have open files. (Gotta love Windows.) And I can't delete all the files in myindex, again because of the open-files issue.
> >
> > Any ideas?
> >
> > Thanks,
> > Patrick

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Swapping Indexes?
On Tue, 17 Aug 2004, Patrick Burleson wrote:
> Stephane,
>
> Thank you for the ideas. I'm implementing idea 1 (I like the idea of leaving the temp index around for recovery), but I have a question regarding your original index. Do you just copy the temp index over it and not worry about cleaning up the old index directory?

Actually, I use an IndexWriter in overwrite mode on the master dir and merge the temp dir. This cleans up the old master.

> Right now my code deletes the files in the main index directory after telling the search controller to switch to the temp index. But doing that means I have to manage existing searches so they don't break while they are running. I also still run into the open-files problem on Windows when I try to delete a file that one of the searchers still has open.

I used to wait some time (~1 minute) for all searches on the old master to finish after redirecting to the temp dir, then I would switch to the new master.

> Thoughts?

If you apply a lease-like contract with your searchers, where they borrow a reference to a searcher and then hand it back to the manager, you can probably trace your open files.

HTH,
sv
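The lease-like contract suggested above could look roughly like the following. This is only a minimal sketch, not code from the thread: the class name `SearcherLeases` and its methods are hypothetical, and the type parameter `S` stands in for whatever searcher object the application uses. The idea is that the manager counts outstanding borrows per searcher, so it can tell when an old searcher has no users left and its files can be closed and deleted.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical lease manager (not part of Lucene). Searches borrow the
// current searcher and hand it back when they are done.
final class SearcherLeases<S> {
    private S current;
    // Outstanding borrow count per searcher instance (identity-based).
    private final Map<S, Integer> borrowed = new IdentityHashMap<>();

    SearcherLeases(S initial) {
        this.current = initial;
    }

    // Hand out the current searcher and record the lease.
    synchronized S borrow() {
        borrowed.merge(current, 1, Integer::sum);
        return current;
    }

    // Return a lease; the entry is removed when the count reaches zero.
    synchronized void giveBack(S searcher) {
        borrowed.computeIfPresent(searcher, (s, n) -> n == 1 ? null : n - 1);
    }

    // Make a new searcher current; the old one is returned to the caller.
    synchronized S swap(S next) {
        S old = current;
        current = next;
        return old;
    }

    // True when no search still holds a lease on this searcher.
    synchronized boolean isIdle(S searcher) {
        return !borrowed.containsKey(searcher);
    }
}
```

With something like this, the indexing job would call `swap(...)`, then poll `isIdle(old)` before closing the old searcher and deleting its files, instead of waiting a fixed amount of time.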
Re: Swapping Indexes?
On Tue, 17 Aug 2004 13:17:10 -0400 (EDT), Stephane James Vaucher wrote:
> Actually, I use an IndexWriter in overwrite mode on the master dir and merge the temp dir. This cleans up the old master.

I'm a bit of a Lucene newbie, and I'm trying to understand what you mean by "merge the temp dir". Do you copy your existing index to the temp location and then use the overwrite feature of IndexWriter to re-create the master? If so, what do you then merge? Shouldn't the master index already have everything?

> I used to wait some time (~1 minute) for all searches on the old master to finish after redirecting to the temp dir, then I would switch to the new master.

I'm going to make this a setting, so that tests won't have to wait a whole minute. But I think this is the cleanest solution short of implementing some sort of leasing scheme. Our searches should be fast, and 1 minute is a long time; they should all be done by then.

Thanks again,
Patrick
Re: Swapping Indexes?
On Tue, 17 Aug 2004, Patrick Burleson wrote:
> On Tue, 17 Aug 2004 13:17:10 -0400 (EDT), Stephane James Vaucher wrote:
> > Actually, I use an IndexWriter in overwrite mode on the master dir and merge the temp dir. This cleans up the old master.
>
> I'm a bit of a Lucene newbie, and I'm trying to understand what you mean by "merge the temp dir".

IndexWriter.addIndexes()

> Do you copy your existing index to the temp location and then use the overwrite feature of IndexWriter to re-create the master? If so, what do you then merge? Shouldn't the master index already have everything?

What I mean is the following:

1) create tmp dir
2) redirect searchers to tmp dir
3) wait for everyone to use tmp dir (or some other mechanism)
4) open an IndexWriter on the master dir, erasing it
5) merge the tmp directory, using the addIndexes() method
6) redirect searchers to the new master dir

> > I used to wait some time (~1 minute) for all searches on the old master to finish after redirecting to the temp dir, then I would switch to the new master.
>
> I'm going to make this a setting, so that tests won't have to wait a whole minute. But I think this is the cleanest solution short of implementing some sort of leasing scheme. Our searches should be fast, and 1 minute is a long time; they should all be done by then.

I used to reindex all my docs at 5:00 AM. I probably could have waited 10 minutes, since I didn't have users; it's all about requirements ;)

> Thanks again,
> Patrick

sv
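The ordering of steps 2 to 6 above, with the wait made configurable as Patrick proposes, might be sketched as follows. Everything here is an assumption for illustration: `IndexSwap`, `Merger`, and the `redirect` callback are hypothetical names, and `Merger` merely stands in for "open an IndexWriter on the master dir in overwrite mode, then call addIndexes() with the temp dir". No Lucene calls are made, so the sketch runs with the standard library alone.

```java
import java.time.Duration;
import java.util.function.Consumer;

final class IndexSwap {
    // Hypothetical stand-in for steps 4 and 5: recreate the master index
    // and merge the temp index into it.
    interface Merger {
        void rebuildMasterFrom(String tempDir, String masterDir);
    }

    // Assumes the temp index (step 1) has already been built in tempDir.
    static void run(String tempDir, String masterDir,
                    Consumer<String> redirect, Duration grace, Merger merger) {
        redirect.accept(tempDir);                      // step 2
        try {
            Thread.sleep(grace.toMillis());            // step 3: let old searches finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted before the swap completed", e);
        }
        merger.rebuildMasterFrom(tempDir, masterDir);  // steps 4 and 5
        redirect.accept(masterDir);                    // step 6
    }
}
```

Passing the grace period as a parameter lets tests use `Duration.ZERO` while production uses a minute or more.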
Re: Swapping Indexes?
I've tried two options that seem to work:

1) Have a singleton that is responsible for controlling your searchers. This controller can temporarily redirect your searchers to c:/temp/myindex, allowing you to copy your index to c:/myindex. After that process completes, your controller can tell your searchers to use c:/myindex, allowing you to then erase your temp index. If you index nightly, you can simply *not* erase your tmp dir; your indexing process will do this automatically if you create your IndexWriter with the overwrite option. This way, you have a backup index if there is a system failure at some point (like when you copy/move directories).

2) Use an incremental index. Regularly, I scan my files, see if there are modifications/additions and update my master index. That means removing from the master index, adding to a temp dir, then merging. I haven't seen any weirdness on Windows with this process.

HTH,
sv

On Mon, 16 Aug 2004, Patrick Burleson wrote:
> I've read in the docs about updating an index and their suggestion regarding swapping out indexes with a directory rename. Here's my question: how do I do this when searches are running live?
>
> Say I have a directory that holds the current valid index:
>
> C:\myindex
>
> and when I'm running my nightly process to generate the index, it gets temporarily indexed to:
>
> C:\temp\myindex
>
> How can I very quickly replace C:\myindex with C:\temp\myindex? I can't simply do a rename, since C:\myindex will likely have open files. (Gotta love Windows.) And I can't delete all the files in myindex, again because of the open-files issue.
>
> Any ideas?
>
> Thanks,
> Patrick
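The controller from option 1 could be as small as the sketch below. It is an illustration under assumptions, not code from the thread: `IndexLocator` and its methods are hypothetical names, and the default directory simply reuses the c:/myindex path from the discussion. Searchers ask the singleton where the live index is each time they open it, and the indexing job flips that location.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical controller (not a Lucene class): the single place that
// knows which index directory searchers should currently use.
final class IndexLocator {
    private static final IndexLocator INSTANCE =
            new IndexLocator(Paths.get("c:/myindex"));

    private final AtomicReference<Path> active;

    private IndexLocator(Path initial) {
        this.active = new AtomicReference<>(initial);
    }

    static IndexLocator getInstance() {
        return INSTANCE;
    }

    // Searchers call this whenever they open the index.
    Path current() {
        return active.get();
    }

    // The indexing job redirects new searches here and gets the
    // previously active directory back.
    Path redirectTo(Path next) {
        return active.getAndSet(next);
    }
}
```

Note that redirecting only affects searches that start afterwards; searches already running against the previous directory still hold its files open until they finish.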