Re: Why exactly are fuzzy queries so slow?

Timo Nentwig Sun, 25 Nov 2007 01:42:28 -0800

On Saturday 24 November 2007 18:28:48 markharw00d wrote:
> The added IO is one factor. Another is the CPU load from doing many
> edit-distance comparisons between index terms and the provided search


You mean FuzzyQuery.rewrite(). Are you sure this is a CPU and not an IO issue 
(reading the terms from the index)? I thought about introducing a cache here 
because the fuzzy factory remains constant.

> term. You can limit the number of edit distance comparisons conducted by
> setting the minimum prefix length. This is a property of the QueryParser
> if parsing queries and a property of FuzzyQuery if constructing
> programatically.
>
>  >>P.S. arent' there any "best practices" for lucene?
>
> Yes. The front page of the the WIKI has a "Basics of Performace". I've
> added a note on FuzzyQueries to the "ImproveSearchingSpeed" page

Thanks! I didn't know there's a wiki...

I heard rumors that I am supposed to use a single IndexSearcher. Now I read 
this on the mentioned page. Why? In fact I pool IndexSearchers (and therefore 
IndexReaders) because I ran in a synchronized() bottleneck when doing IO.

I also heard that this problem would decrease with an MMapDirectory however so 
far I wasn't able to get my hands on a proper 64bit machine...

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Why exactly are fuzzy queries so slow?

Reply via email to