You might want to look at SOLR-2010. This patch works with the "collation" feature, having it test the collations it returns to ensure they'll return hits. So if a user types "san jos" it will know that the combination "san jose" is in the index and "san ojos" is not.
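As a rough sketch of how this might be switched on once the patch is applied: the request handler defaults below ask for a collation and have each candidate tested against the index before it is returned. The handler name "/spell" and the numeric values are illustrative assumptions, and the parameter names spellcheck.maxCollationTries and spellcheck.maxCollations should be checked against the version of the patch you apply.

```xml
<!-- Hypothetical handler name; adjust to your own solrconfig.xml -->
<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- Number of suggestions gathered per misspelled term -->
    <str name="spellcheck.count">10</str>
    <!-- Ask for a rewritten query built from the suggestions -->
    <str name="spellcheck.collate">true</str>
    <!-- Assumed parameter: how many candidate collations to test against the index -->
    <str name="spellcheck.maxCollationTries">10</str>
    <!-- Assumed parameter: how many verified collations to return -->
    <str name="spellcheck.maxCollations">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With settings along these lines, a query for "san jos" should only come back with a collation such as "san jose" if that combination actually returns hits.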

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Savannah Beckett [mailto:savannah_becket...@yahoo.com]
Sent: Monday, September 27, 2010 7:45 PM
To: solr-user@lucene.apache.org
Cc: erickerick...@gmail.com
Subject: Re: Need help with spellcheck city name

No, I checked, there is a city called Swan in Iowa. So, it is getting from the city index, so is Clerk. But why does it favor Swan than San? Spellcheck get weird after I treat city name as one token. If I do it in the old way, it let San go, and correct Jos as Ojos instead of Jose because Ojos is ranked as #1 and Jose at the middle. Any more suggestions? Rank it by frequency first then score doesn't work neither.

________________________________
From: Erick Erickson <erickerick...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Mon, September 27, 2010 5:24:25 PM
Subject: Re: Need help with spellcheck city name

Hmmm, did you rebuild your spelling index after the config changes? And it really looks like somehow you're getting results from a field other than city. Are you also sure that your cityname field is of type autocomplete1?

Shooting in the dark here, but these results are so weird that I suspect it's something fundamental....

Best
Erick

On Mon, Sep 27, 2010 at 8:05 PM, Savannah Beckett <savannah_becket...@yahoo.com> wrote:

> No, it doesn't work, I got weird result. I set my city name field to be parsed
> as a token as following:
>
> <fieldType name="autocomplete1" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> I got following result for spellcheck:
>
> <lst name="spellcheck">
>   <lst name="suggestions">
>     <lst name="san">
>       <int name="numFound">1</int>
>       <int name="startOffset">0</int>
>       <int name="endOffset">3</int>
>       <arr name="suggestion">
>         <str>swan</str>
>       </arr>
>     </lst>
>     <lst name="clar">
>       <int name="numFound">1</int>
>       <int name="startOffset">4</int>
>       <int name="endOffset">8</int>
>       <arr name="suggestion">
>         <str>clark</str>
>       </arr>
>     </lst>
>   </lst>
> </lst>
>
> ________________________________
> From: Tom Hill <solr-l...@worldware.com>
> To: solr-user@lucene.apache.org
> Sent: Mon, September 27, 2010 3:52:48 PM
> Subject: Re: Need help with spellcheck city name
>
> Maybe process the city name as a single token?
>
> On Mon, Sep 27, 2010 at 3:25 PM, Savannah Beckett
> <savannah_becket...@yahoo.com> wrote:
> > Hi,
> > I have city name as a text field, and I want to do spellcheck on it. I use
> > setting in http://wiki.apache.org/solr/SpellCheckComponent
> >
> > If I setup city name as text field and do spell check on "San Jos" for San
> > Jose, I get suggestion for Jos as "ojos". I checked the extendedresult and I
> > found that Jose is in the middle of all 10 suggestions in term of score and
> > frequency. I then set city name as string field, and spell check again, I got
> > Van for San and Ross for Jos, which is weird because San is correct.
> >
> > How do you setup spellchecker to spellcheck city names? City name can have
> > multiple words.
> > Thanks.