Adam:

I think you're worrying about the wrong thing. There is no "period of
unserviceability" to worry about in closing/reopening a searcher, if by
"searcher" you mean the Lucene IndexSearcher/Reader. If you're
talking about shutting down your service, that's another story.

What you *do* have to think about is whether your service is single- or
multi-threaded. If it's single-threaded, somewhere in your process, you have
something like

get request
service request
return response

So as part of, say, the service request part, you close/open a searcher. No
service unavailability here, perhaps a slight delay if you haven't warmed up
your searcher.

If you're multi-threaded, you probably have to make sure that all your
threads are waiting around when you close/open your searcher, since you
*should* be using a single (static) searcher across all your threads. Again,
no unserviceability, but perhaps a small delay depending upon how long it
takes all the threads to finish servicing their requests.
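
For what it's worth, one way to get that "wait until the threads are done"
behaviour is a read/write lock: request threads share the read lock, and the
close/open step takes the write lock. This is only a rough sketch, assuming
Java 8 or later. SharedSearcher and its method names are made up for
illustration, S stands in for whatever searcher type you hold, and nothing
here is Lucene API.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical wrapper around the single searcher shared by all threads.
final class SharedSearcher<S> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private S searcher;

    SharedSearcher(S initial) {
        this.searcher = initial;
    }

    // Request threads: any number may hold the read lock at the same time.
    <R> R withSearcher(Function<S, R> request) {
        lock.readLock().lock();
        try {
            return request.apply(searcher);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Reopen: waits for in-flight requests to finish, then replaces the
    // searcher. closeAndOpen receives the old searcher and returns the new one.
    void reopen(UnaryOperator<S> closeAndOpen) {
        lock.writeLock().lock();
        try {
            searcher = closeAndOpen.apply(searcher);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

New requests block only while reopen() holds the write lock, which is the
"small delay" mentioned above.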

All that said, it's less important than the fact that when you do reopen
your searcher, you'll get a slow response for that request only as the
searcher builds up its caches. You can avoid this by having your searcher
execute a query or two before you switch it in.
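
Something along these lines is what I mean by warming before switching in.
Again a sketch only, assuming Java 8 or later. WarmSwap, swapIn and the
opener/warmUp parameters are names I invented, and S is a placeholder for
your searcher type rather than anything from Lucene.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical holder for the searcher that request threads read from.
final class WarmSwap<S> {
    private final AtomicReference<S> current = new AtomicReference<>();

    // Request threads call this to get the searcher currently in service.
    S get() {
        return current.get();
    }

    // Open a new searcher, run the warm-up against it while the old one is
    // still serving requests, then switch it in. Returns the previous
    // searcher (null the first time) so the caller can close it once no
    // request is still using it.
    S swapIn(Supplier<S> opener, Consumer<S> warmUp) {
        S fresh = opener.get();
        warmUp.accept(fresh);          // e.g. execute a query or two
        return current.getAndSet(fresh);
    }
}
```

The slow first query is then paid by the warm-up call and not by a real
request.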

But I wouldn't worry about this until you demonstrate to your satisfaction
that there's actually a problem. Go with "the simplest thing that could
possibly work", analyze any problems, and fix it up. As you say, this index
really isn't very big. Not if it builds in 3 seconds. I rather doubt that
you have to do anything at all fancy to get satisfactory performance. But
what do I know <G>?

Ditto for worrying about an FS- or RAM-based searcher. Don't bother trying the
RAM solution (as it's going to be more complex) until you know an FS based
index won't work. Especially since the FS based index largely *is* a RAM
based index when you consider caching. NOTE: when you're measuring things,
ignore the time it takes to service the *first* request since that'll be
misleading.
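
If it helps, here is roughly how I'd take that measurement. It's a minimal
sketch, assuming Java 8 or later. SearchTiming and meanNanosAfterFirst are
hypothetical names, and the Runnable is whatever code issues one search
request in your service.

```java
// Hypothetical timing helper; not part of Lucene.
final class SearchTiming {
    // Runs the request once without timing it (that run pays for cache
    // warm-up), then runs it `runs` more times and returns the mean
    // wall-clock time of those timed runs in nanoseconds.
    static long meanNanosAfterFirst(Runnable request, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        request.run(); // first request: deliberately ignored
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            request.run();
            total += System.nanoTime() - start;
        }
        return total / runs;
    }
}
```

Compare the number you get for the FS-based index against whatever response
time you actually need before deciding the RAM version is worth the trouble.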

Anyway, hope all this helps
Erick

On 12/22/06, Adam Fleming <[EMAIL PROTECTED]> wrote:


Hi Patrick

Thanks for the thoughtful responses.  I am not a pro with Searchers yet,
but it seems like closing + opening searchers would still result in a small
period of unserviceability.  I would also like to stick to the Directory API
so that I can keep the option to use FS or RAM based indexes.

I think a slight extension to this idea may really do the trick.  It's as
follows:

1. Create Index A + Reader A.
2. set CurrentReader = A
3. After time interval T, build Index B + Reader B.
4. set CurrentReader = B
5. After time interval T, rebuild Index A + Reopen A
6. set CurrentReader = A

etc.

The advantage here is that the '=' operation is atomic and indivisible
- the currentReader variable always points to a valid and up-to-date
index.  Although this system doesn't GUARANTEE there won't be a service
interruption, in practice if T is long enough there shouldn't be a problem.


Thoughts?  I'm curious if this solution reflects a misunderstanding about
the way Lucene works.


Thanks,

Adam



----------------------------------------
> Date: Wed, 20 Dec 2006 22:11:33 -0500
> From: [EMAIL PROTECTED]
> To: java-user@lucene.apache.org
> Subject: Re: Rebuilding index on a regular basis
>
> Hi,
>
> How about this:
>
> 1) You copy the files that make your index in a new folder
> 2) You update your index in that new folder (forcing if necessary, old
> locks will not be valid)
> 3) When update is completed, close your readers, and open them on the
> new index.
> 4) Copy the fresh index files to the previous location for next round,
> where you won't need the initial copy to a fresh folder.
>
> That way, you won't have to reindex all your documents (assuming only a
> small subset needs updating) and will be able to switch to a more up to
> date index more easily and often.
>
> Patrick
>
>
> On 12/20/06, Scott Sellman <[EMAIL PROTECTED]> wrote:
> >
> > Note: I have changed the title of this thread to match its content
> >
> > I am currently facing a similar issue.  I am dealing with a large
> > index that is constantly used and needs to be updated on a daily
> > basis.  For fear of corruption I would rather rebuild the index each
> > time, performing tests against it before using it.  However the
> > problem I am having is switching in the old index without causing
> > service interruption.  As long as queries are being made against the
> > index I am running into locking issues with the index files,
> > preventing me from putting the new index in place. Any suggestions?
> >
> > Thanks,
> > Scott
> >
