On 9/7/06, Neville Burnell <[EMAIL PROTECTED]> wrote:
> Thanks for your email Dave,
>
> I've thought about this overnight, and I've got a few questions please.
>
> > When you open an IndexReader on the index it is opened up on
> > that particular version (or state) of the index
>
> Would you elaborate on how Ferret manages versions please. For example,
> can I have two readers open, one which accesses the old version of the
> index, and the second which accesses the latest version?

When you open an IndexReader it opens all the files that it needs to
read the index and keeps all of those file handles. Even after the
index is updated and those files are deleted, the operating system
does not actually free them while the handles remain open. If you then
open an IndexReader on a later version, it holds file handles to all
the files needed for that version. So the answer is yes, you can have
multiple IndexReaders open on an index at the same time, all reading
different versions. Each version of the index has an internal version
number, and there is an IndexReader#latest? method to determine whether
the version of the index that you are reading is the current version.
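
To illustrate the file-handle behaviour described above, here is a
minimal sketch in plain Ruby that does not use Ferret at all. The
method name read_after_delete and the file name "segment_v1" are made
up for the example, and it assumes a POSIX-style filesystem (on
Windows, deleting an open file usually fails instead).

```ruby
require "tmpdir"

# Hypothetical helper: shows that an open handle keeps a deleted
# file's contents readable on POSIX-style filesystems.
def read_after_delete
  Dir.mktmpdir do |dir|
    path = File.join(dir, "segment_v1")   # stand-in for an index file
    File.write(path, "version 1 data")

    handle = File.open(path, "r")         # like a reader holding its files
    File.delete(path)                     # like an update removing old files

    still_listed = File.exist?(path)      # the directory entry is gone
    contents = handle.read                # but the data is still readable
    handle.close                          # only now can the space be freed

    [still_listed, contents]
  end
end
```

This is the same mechanism that lets a reader opened on an older
version keep working after a newer version has been written.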

> > So to keep searches up to date you need to close and reopen
> > your IndexReader every time you commit changes to the index.
>
> I guess by reopen you mean IndexReader.new ?

That's correct. Don't forget to close the old IndexReader. The
garbage collector will do this for you, but IndexReaders hold a lot of
resources, so it's best to close them as soon as you no longer need
them.
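
One possible way to package the close-and-reopen step is sketched
below. ReaderHolder is a hypothetical helper, not part of Ferret; it
only assumes that the object returned by the block responds to latest?
and close, as IndexReader does. It makes no attempt to deal with other
threads that may still be searching the old reader when it is closed.

```ruby
# Hypothetical helper (not part of Ferret): keeps one reader and swaps
# it for a fresh one whenever the current reader reports it is stale.
class ReaderHolder
  # The block must return a newly opened reader each time it is called,
  # e.g. { Ferret::Index::IndexReader.new("/path/to/index") }
  # (the path here is a placeholder).
  def initialize(&opener)
    @opener = opener
    @reader = opener.call
  end

  # Returns a reader on the latest version, reopening if necessary.
  def current
    unless @reader.latest?
      stale = @reader
      @reader = @opener.call   # open the new version first
      stale.close              # then release the old file handles
    end
    @reader
  end
end
```

You would then call holder.current before each search rather than
holding on to a reader directly.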

> I proceeded to replace my Index usage with an IndexReader and Searcher
> which are closed and recreated after each IndexWriter pass, and the
> result seems to be that searches are still serialised - ie, a long
> running query on thread t1 "blocks" the normally very fast query on
> thread t2.
>
> Might I be seeing another point of synchronisation, or am I just
> observing a characteristic of ruby threads ?

I think it's probably a symptom of using Ruby threads. I don't think
Ruby can switch threads in the middle of a call to a C function. It's
unusual, however, for a search to take long enough to be a problem.
What kind of search is it? If it's a PrefixQuery, FuzzyQuery or
WildcardQuery you'll get much better performance on an optimized
index. If you are making heavy use of any of these queries, it is the
one time I'd recommend always keeping the index in an optimized state.

cheers,
Dave
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk