Thank you for the information. I'm currently using fuzzy search with the fuzzy similarity value set to ~0.79, which allows a 20% error rate (i.e. for words with 5 characters it allows 1 misspelled character, and for words with 10 characters it allows 2 misspelled characters).
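
To make the percentage arithmetic concrete, here is a rough client-side sketch along the lines suggested in the quoted reply below. It is only an illustration under assumptions: the function names `max_edits` and `fuzzy_query` are hypothetical, the query is split on whitespace with no escaping of special characters, the edit count is rounded down, and it emits the integer `term~N` form accepted by the Lucene/Solr 4.x and later query parser rather than the 0-to-1 similarity form described on the 3.6 page.

```python
MAX_EDITS = 2  # cap mentioned in the quoted reply; assumed to apply here


def max_edits(word, error_percent=20):
    """Edits allowed for a word: a percentage of its length, rounded down, capped."""
    return min(MAX_EDITS, (len(word) * error_percent) // 100)


def fuzzy_query(text, error_percent=20):
    """Append ~N to each whitespace-separated term that is allowed at least one edit."""
    parts = []
    for word in text.split():
        edits = max_edits(word, error_percent)
        # Terms too short for any edit are left as exact matches.
        parts.append(f"{word}~{edits}" if edits > 0 else word)
    return " ".join(parts)


if __name__ == "__main__":
    print(fuzzy_query("solar management"))    # 5 and 10 characters at 20%
    print(fuzzy_query("test", error_percent=25))  # 4 characters at 25%
```

Under these assumptions, `fuzzy_query("solar management")` gives `solar~1 management~2`, and a 4-character word only gets an edit once `error_percent` is raised to 25.
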
However, for words with 4 characters, I'll need to set the value to ~0.75 to allow 1 misspelled character, since a single misspelled character in a 4-character word is a 25% error rate. We will probably not accommodate 3-character words.

I got the information from here:
http://lucene.apache.org/core/3_6_0/queryparsersyntax.html#Fuzzy%20Searches

Just to check, will this affect the performance of the system?

Regards,
Edwin

On 7 May 2015 at 20:00, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:

> Hi!
> Currently Solr builds an FST to provide proper fuzzy search or spellcheck
> suggestions based on the string distance.
> The current default algorithm is the Levenshtein distance (which returns the
> number of edits as the distance metric).
> In your case you should calculate, client side, the edits you want to apply
> to your search.
> In your client code, it should not be difficult to process the query and apply
> the proper number of edits depending on the length.
>
> Anyway, the max edits for the default Levenshtein distance is fixed to 2.
>
> Cheers
>
> 2015-05-05 10:24 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
>
> > Hi,
> >
> > Would like to check, how do we implement character proximity searching
> > that's in terms of percentage with regards to the length of the word,
> > instead of a fixed number of edit distance (characters)?
> >
> > For example, if we have a proximity of 20%, a word with 5 characters will
> > have an edit distance of 1, and a word with 10 characters will
> > automatically have an edit distance of 2.
> >
> > Will Solr be able to do that for us?
> >
> > Regards,
> > Edwin
> >
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>