You can check the new words against a dictionary and keep the ones it does
not recognize in the stopword file, as Jason described...
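
A rough sketch of that cross-check, in case it helps. It assumes you have
already exported the index terms and a reference word list as plain strings;
the function names, the known_good allow-list and the sample words are made
up for illustration. Only the misspelled_words.txt file name comes from
Jason's config. Domain terms that a general dictionary does not know would
be flagged too, so the output still needs a manual review (or an allow-list)
before you use it.

```python
def find_suspect_terms(index_terms, dictionary_words, known_good=()):
    """Return the index terms missing from the dictionary, sorted and lowercased.

    known_good is a hypothetical allow-list for valid domain terms that a
    general-purpose dictionary would not contain.
    """
    dictionary = {w.strip().lower() for w in dictionary_words if w.strip()}
    allowed = {w.strip().lower() for w in known_good if w.strip()}
    suspects = set()
    for term in index_terms:
        term = term.strip().lower()
        if term and term not in dictionary and term not in allowed:
            suspects.add(term)
    return sorted(suspects)


def write_stopword_file(terms, path="misspelled_words.txt"):
    """Write one term per line, the layout a stopword file uses."""
    with open(path, "w", encoding="utf-8") as handle:
        for term in terms:
            handle.write(term + "\n")


# Made-up sample data: "serach" is the only term neither list knows.
index_terms = ["solr", "lucene", "serach", "search", "Serach"]
dictionary_words = ["search", "engine"]
domain_terms = ["solr", "lucene"]

suspects = find_suspect_terms(index_terms, dictionary_words, domain_terms)
print(suspects)  # ['serach']
```

After reviewing the list, write_stopword_file(suspects) produces the file
that the StopFilterFactory in Jason's field type points at.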

What Pradeep said is true: it is always better to have suggestions drawn from
your own index than suggestions that return no results...


On Mon, Oct 18, 2010 at 6:24 PM, Jason Blackerby <jblacke...@gmail.com> wrote:

> If you know the misspellings you could prevent them from being added to the
> dictionary with a StopFilterFactory like so:
>
>    <fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
>      <analyzer>
>        <tokenizer class="solr.StandardTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.StopFilterFactory" ignoreCase="true" words="misspelled_words.txt"/>
>        <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all"/>
>        <filter class="solr.LengthFilterFactory" min="2" max="50"/>
>      </analyzer>
>    </fieldType>
>
> where misspelled_words.txt contains the misspellings.
>
> On Mon, Oct 18, 2010 at 5:14 PM, Pradeep Singh <pksing...@gmail.com>
> wrote:
>
> > I think a spellchecker based on your index has clear advantages. You can
> > spellcheck words specific to your domain which may not be available in an
> > outside dictionary. You can always dump the list from WordNet to get a
> > starter English dictionary.
> >
> > But then it also means that misspelled words from your domain become the
> > suggested correct word. Hmmm ... you'll need to have a way to prune out
> > such words. Even then, your own domain based dictionary is a total go.
> >
> > On Mon, Oct 18, 2010 at 1:55 PM, Jonathan Rochkind <rochk...@jhu.edu>
> > wrote:
> >
> > > In general, the benefit of the built-in Solr spellcheck is that it can
> > > use a dictionary based on your actual index.
> > >
> > > If you want to use some external API, you certainly can, in your actual
> > > client app -- but it doesn't really need to involve Solr at all anymore,
> > > does it?  Is there any benefit I'm not thinking of to doing that on the
> > > Solr side, instead of just in your client app?
> > >
> > > I think Yahoo (and maybe Microsoft?) have similar APIs with more generous
> > > ToSs, but I haven't looked in a while.
> > >
> > >
> > > Xin Li wrote:
> > >
> > >> Oops, never mind. Just read Google API policy. 1000 queries per day
> > >> limit & for non-commercial use only.
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Xin Li
> > >> Sent: Monday, October 18, 2010 3:43 PM
> > >> To: solr-user@lucene.apache.org
> > >> Subject: Spell checking question from a Solr novice
> > >>
> > >> Hi,
> > >> I am looking for a quick solution to improve a search engine's spell
> > >> checking performance. I was wondering if anyone tried to integrate
> > >> Google SpellCheck API with Solr search engine (if possible). Google
> > >> spellcheck came to my mind because of two reasons. First, it is costly
> > >> to clean up the data to be used as spell check baseline. Secondly,
> > >> Google probably has the most complete set of misspelled search terms.
> > >> That's why I would like to know if it is a feasible way to go.
> > >>
> > >> Thanks,
> > >> Xin
> > >> This electronic mail message contains information that (a) is or may be
> > >> CONFIDENTIAL, PROPRIETARY IN NATURE, OR OTHERWISE PROTECTED BY LAW FROM
> > >> DISCLOSURE, and (b) is intended only for the use of the addressee(s)
> > >> named herein.  If you are not an intended recipient, please contact the
> > >> sender immediately and take the steps necessary to delete the message
> > >> completely from your computer system.
> > >>
> > >> Not Intended as a Substitute for a Writing: Notwithstanding the Uniform
> > >> Electronic Transaction Act or any other law of similar effect, absent an
> > >> express statement to the contrary, this e-mail message, its contents, and
> > >> any attachments hereto are not intended to represent an offer or
> > >> acceptance to enter into a contract and are not otherwise intended to
> > >> bind this sender, barnesandnoble.com llc, barnesandnoble.com inc. or any
> > >> other person or entity.
> > >>
> > >>
> > >
> >
>



-- 
______
Ezequiel.

Http://www.ironicnet.com
