>> I do have serious problems with the relevance of the results with fuzzy
>> queries.

Please take the time to read my response here:
http://www.gossamer-threads.com/lists/lucene/java-user/62050#62050

A work colleague ran into exactly the same problem this week, and the
solution is the same.

I just tested my index with a standard Lucene FuzzyQuery for "Paul~". It
returns "Phul", "Saul", and "Paulo" before ANY "Paul" records due to IDF
issues (the rare variants get a higher IDF than the common term "Paul").
Using FuzzyLikeThisQuery puts all the "Paul" records ahead of the variants.

----- Original Message ----
From: László Monda <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Cc: [EMAIL PROTECTED]
Sent: Monday, 23 June, 2008 12:10:05 PM
Subject: Re: Getting irrelevant results using fuzzy query

On Wed, 2008-06-18 at 21:10 +0200, Daniel Naber wrote:
> On Wednesday, 18 June 2008, László Monda wrote:
>
> > Additional info: Lucene seems to do the right thing when only a few
> > documents are present, but goes crazy when there are about 1.5 million
> > documents in the index.
>
> Lucene works well with more documents (currently using it with 9 million),
> but the fuzzy query requires iteration over all terms, which makes this
> query slow. This can be avoided by setting the prefixLength parameter of the
> FuzzyQuery constructor to 1 or 2. Or maybe you should use an n-gram index,
> see the spellchecker in the contrib area.

Thanks for the suggestion, but I don't have any performance problems yet,
but I do have serious problems with the relevance of the results with fuzzy
queries.

--
Laci <http://monda.hu>
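
To make the IDF effect mentioned in the reply above concrete, here is a rough, self-contained sketch in plain Java. It does not call Lucene. It only mimics a simplified form of classic TF-IDF scoring (tf = 1; norms, coord and queryNorm ignored), with the variant boost scaled from the edit-distance similarity the way I understand FuzzyQuery to do it with a minimum similarity of 0.5. The class name FuzzyIdfSketch and the document frequencies (5,000 for "Paul", 3 for "Saul") are made up for illustration; only the 1.5 million document count comes from the thread. The second half reuses the source term's IDF for the variant, which is my reading of what FuzzyLikeThisQuery does, so treat that part as an assumption.

```java
// Simplified model of classic TF-IDF scoring for one expanded fuzzy term.
// Not Lucene code: norms, coord and queryNorm are ignored, and all
// document frequencies below are hypothetical.
final class FuzzyIdfSketch {

    // Classic idf formula: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Boost for a variant: 0 at minSimilarity, 1 for an exact match.
    static double fuzzyBoost(int editDistance, int shorterLength, double minSimilarity) {
        double similarity = 1.0 - (double) editDistance / shorterLength;
        return (similarity - minSimilarity) / (1.0 - minSimilarity);
    }

    // Per-term score with tf = 1: idf enters once via the query weight
    // and once via the field weight, hence idf squared.
    static double score(double idf, double boost) {
        return idf * idf * boost;
    }

    public static void main(String[] args) {
        long numDocs = 1_500_000L;            // index size from the thread
        double idfPaul = idf(5_000, numDocs); // hypothetical: common exact term
        double idfSaul = idf(3, numDocs);     // hypothetical: rare variant

        double exactBoost = fuzzyBoost(0, 4, 0.5);   // "Paul" vs "Paul"
        double variantBoost = fuzzyBoost(1, 4, 0.5); // "Paul" vs "Saul"

        // Each term scored with its own idf: the rare variant can win.
        System.out.println("own idf    Paul=" + score(idfPaul, exactBoost)
                + " Saul=" + score(idfSaul, variantBoost));

        // Variant scored with the source term's idf: the exact match wins.
        System.out.println("shared idf Paul=" + score(idfPaul, exactBoost)
                + " Saul=" + score(idfPaul, variantBoost));
    }
}
```

With these invented numbers the rare variant outscores the exact match whenever each term keeps its own IDF, and the order flips once both share the source term's IDF. Note that prefixLength, as suggested in the quoted message, addresses speed rather than this ranking problem.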