Best practice for Fuzzy Search combined with Phrase Queries

2016-10-06 Thread Markus Lang
Hi,
I am interested in best practices for handling phrase queries where
only part of the phrase matches and/or the user has made typos.
Are there any papers on when to use only part of the query phrase, or on how
many words of the phrase should be corrected rather than skipped?
Does anyone know how, e.g., Google or Amazon deal with these issues?

Best regards

Markus


Solr Spellchecker with weighted Dictionary

2016-09-28 Thread Markus Lang
Hello,
Is it possible to configure the Solr Spellchecker Component to use a
weighted dictionary?

At the moment I use the solr.SuggestComponent because it provides the
possibility to use a "sourceLocation" with weights for each entry. But this
leads me to a different problem:

For example, I have the following entries in my sourceLocation:

berlin 14
berlin wall 42
london bridge 32

If the user queries for "berlin", I get the suggestion "berlin wall"
because of that entry's higher weight. But in this case "berlin" would be
a perfectly fine query on its own, since I don't actually want to suggest
a completion but simply a spelling correction.
On the other hand, if the user queries for "london" or "lundon", suggesting
"london bridge" is intended, since the word "london" alone is not in the
sourceLocation.
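
To make the intended behaviour concrete, here is a minimal sketch in plain Python. It does not call Solr and is not how the SuggestComponent works internally; it only illustrates the ranking described above. The names `ENTRIES`, `distance_row` and `did_you_mean` are hypothetical, the limit of two edits is an assumption, and the middle tier (prefer a fuzzy match on a whole entry before falling back to a fuzzy prefix match) is one possible reading of "spell correction, not completion", not something the configuration is known to support.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory copy of the weighted entries from the example above.
ENTRIES: Dict[str, int] = {
    "berlin": 14,
    "berlin wall": 42,
    "london bridge": 32,
}


def distance_row(query: str, entry: str) -> List[int]:
    """Return a row where index j is the Levenshtein distance
    between `query` and the prefix `entry[:j]`."""
    prev = list(range(len(entry) + 1))  # distances from "" to each prefix
    for i, qc in enumerate(query, start=1):
        curr = [i]
        for j, ec in enumerate(entry, start=1):
            cost = 0 if qc == ec else 1
            curr.append(min(prev[j] + 1,          # delete from query
                            curr[j - 1] + 1,      # insert into query
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev


def did_you_mean(query: str,
                 entries: Dict[str, int],
                 max_edits: int = 2) -> Optional[str]:  # max_edits: assumed limit
    q = query.strip().lower()

    # Tier 1: the query is a dictionary entry, so there is nothing to correct.
    if q in entries:
        return None

    rows = {entry: distance_row(q, entry) for entry in entries}

    # Tier 2 (assumption): fuzzy match against whole entries, best weight wins.
    whole = [(w, e) for e, w in entries.items() if rows[e][-1] <= max_edits]
    if whole:
        return max(whole)[1]

    # Tier 3: fuzzy match against any prefix of an entry, best weight wins.
    prefix = [(w, e) for e, w in entries.items() if min(rows[e]) <= max_edits]
    if prefix:
        return max(prefix)[1]

    return None


if __name__ == "__main__":
    print(did_you_mean("berlin", ENTRIES))  # None
    print(did_you_mean("lundon", ENTRIES))  # london bridge
```

With this ordering, "berlin" is left alone, while "london" and "lundon" both fall through to the prefix tier and yield "london bridge". Plain Levenshtein distance is used here for brevity; it does not count a transposition as a single edit.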


At the moment I use this configuration:

  didYouMean
  org.apache.solr.spelling.suggest.fst.FuzzyLookupFactory
  org.apache.solr.spelling.suggest.FileDictionaryFactory
  conf/queries.txt
  \t
  didYouMeanCache
  string
  true
  true

  2
  1
  true

  true
  true
  5

  didYouMean

Thanks and best regards
Markus