SOLR Performance Tuning: Fuzzy Search

Fuad Efendi Wed, 03 Feb 2010 10:29:27 -0800

I was lucky to contribute an excellent solution: 
http://issues.apache.org/jira/browse/LUCENE-2230


Even the 2nd edition of Lucene in Action advises using fuzzy search only in
exceptional cases.


Another solution would be 2-step indexing (it may work for many use cases),
but it is not a "spellchecker":

1. Create a regular index
2. Create a dictionary of terms
3. For each term, find nearest terms (for instance, stick with distance=2)
4. Use "copyField" in SOLR, or something similar to a synonym dictionary; or, for
instance, generate a specific Query Parser...
5. Of course, a custom request handler,
and so on.
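
Steps 2 to 4 can be sketched roughly as follows. This is only an illustration in plain Python, not the LUCENE-2230 patch and not SOLR code: the function names (edit_distance, nearest_terms, synonym_lines) and the sample terms are hypothetical, the pairwise comparison is quadratic in the dictionary size, and the output lines merely assume the "a => b, c" explicit-mapping style used in SOLR synonym files.

```python
def edit_distance(a: str, b: str, limit: int) -> int:
    """Levenshtein distance, capped at limit + 1 once the limit is exceeded."""
    if abs(len(a) - len(b)) > limit:
        return limit + 1
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        if min(cur) > limit:  # row minimum never decreases, so stop early
            return limit + 1
        prev = cur
    return min(prev[-1], limit + 1)


def nearest_terms(terms, max_distance=2):
    """Step 3: map every dictionary term to its neighbours within max_distance."""
    vocab = sorted(set(terms))  # step 2: the dictionary of indexed terms
    neighbors = {t: [] for t in vocab}
    for i, a in enumerate(vocab):
        for b in vocab[i + 1:]:
            if edit_distance(a, b, max_distance) <= max_distance:
                neighbors[a].append(b)
                neighbors[b].append(a)
    return neighbors


def synonym_lines(neighbors):
    """Step 4 (assumed format): one 'term => term, near1, near2' line per term."""
    return [f"{t} => {', '.join([t] + near)}"
            for t, near in neighbors.items() if near]


if __name__ == "__main__":
    sample = ["solr", "sole", "solar", "lucene", "lucent"]  # hypothetical terms
    for line in synonym_lines(nearest_terms(sample)):
        print(line)
```

For a real index the pairwise loop would have to be replaced by something smarter (an n-gram index, a BK-tree, etc.); the sketch only shows the shape of the data that steps 3 and 4 produce.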

It may work well (but only if the query contains a term from the dictionary; it
can't work as a spellchecker).
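
A small sketch of that limitation, assuming the nearest-term map has already been computed at index time. The expand_query helper, the sample map, and the AND/OR query syntax are hypothetical and for illustration only: a query term found in the dictionary is expanded to its neighbours, while a misspelled term that was never indexed gets no expansion at all.

```python
def expand_query(query, neighbors):
    """Expand each query term with its precomputed neighbours, if it has any."""
    clauses = []
    for term in query.lower().split():
        variants = [term] + neighbors.get(term, [])
        if len(variants) > 1:
            clauses.append("(" + " OR ".join(variants) + ")")
        else:
            clauses.append(term)  # term not in dictionary: left unchanged
    return " AND ".join(clauses)


if __name__ == "__main__":
    # Hypothetical precomputed map: dictionary term -> terms within distance 2.
    neighbors = {"solr": ["solar", "sole"]}
    print(expand_query("solr tuning", neighbors))
    print(expand_query("slor tuning", neighbors))  # misspelling: no expansion
```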

Combining the two algorithms can boost performance significantly...


Fuad Efendi
+1 416-993-2060
http://www.linkedin.com/in/liferay

Tokenizer Inc.
http://www.tokenizer.ca/
Data Mining, Vertical Search
