On Saturday 24 November 2007 18:48:18 Mathieu Lecarme wrote:
> Fuzzy queries are simply not indexed.
> If you want to search quickly with fuzzy search, you should index words
> and their n-grams; it's the "did you mean" pattern.

Replacing fuzzy with "did you mean" is indeed my favourite option; however, so far I don't know how to do it (in my case). Are there some examples to look at?

I think one of the problems with a fuzzy query is that it searches for all terms that match the given Levenshtein distance. It doesn't care whether a particular term might be in a document or field that I'm not interested in at all.

> You first select the words in use which share n-grams with the query word,
> the distance is computed with Levenshtein, and you use this word as a
> synonym.
>
> M.
>
> On 24 Nov. 07 at 17:36, Timo Nentwig wrote:
> > Hi!
> >
> > I search a 1.5 gig index and fuzzy queries are really slow;
> > something like avg. ~500ms (IndexSearcher.search(Query, HitCollector)).
> >
> > When performing exact queries I achieve response times <25ms. What is
> > it that makes fuzzy queries so slow? Increased index access due to more
> > terms, i.e. disk IO?
> >
> > And no, my fuzzy queries (fuzzy factor 0.8) don't blow up to a boolean
> > query with 100s of clauses but maybe something...less than 10.
> >
> > Thanks
> > Timo
> >
> > P.S. Aren't there any "best practices" for Lucene? Does everybody have
> > to find out on his own (over and over again) and spend a lot of time
> > reading and understanding Lucene's code base?
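
The "did you mean" pattern described in the quoted reply (select known words that share n-grams with the query word, then rank them by Levenshtein distance) might look roughly like the sketch below. It is plain Python rather than Lucene's Java API, so that it stays self-contained; the function names, the bigram size, the `^`/`$` padding and the `max_distance` cut-off are all assumptions made for illustration, not anything taken from Lucene.

```python
from collections import defaultdict


def ngrams(word, n=2):
    """Return the set of character n-grams of a word, padded at both ends."""
    padded = f"^{word}$"  # padding markers are an illustrative choice
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def build_ngram_index(vocabulary, n=2):
    """Map each n-gram to the set of vocabulary words containing it."""
    index = defaultdict(set)
    for word in vocabulary:
        for gram in ngrams(word, n):
            index[gram].add(word)
    return index


def suggest(query, index, n=2, max_distance=2):
    """Return the closest known word within max_distance, or None."""
    # Only words sharing at least one n-gram with the query are candidates,
    # so the edit distance is computed for a small subset of the vocabulary.
    candidates = set()
    for gram in ngrams(query, n):
        candidates |= index.get(gram, set())
    scored = [(levenshtein(query, word), word) for word in candidates]
    scored = [(dist, word) for dist, word in scored if dist <= max_distance]
    return min(scored)[1] if scored else None


# Hypothetical vocabulary; in practice it would be the terms of one field.
vocabulary = ["lucene", "search", "index", "query", "fuzzy"]
index = build_ngram_index(vocabulary)
suggestion = suggest("lucine", index)
```

Building the vocabulary from a single field's terms would also address the concern above about matching terms from documents or fields that are not of interest. The returned word can then be searched with an ordinary exact query. As far as I know, the spellchecker module in Lucene's contrib area is built on a similar n-gram idea and may be a better starting point than a hand-rolled version.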