"Yonik's Law of Patches" reads: "A half-baked patch in Jira, with no
documentation, no tests and no backwards compatibility is better than no
patch at all."

It'd be perfectly appropriate, IMO, for you to post an outline of what your
enhancements do over on the SOLR dev list and get a reaction from the folks
over there as to whether it should be a Jira or not... see
solr-...@lucene.apache.org

Best
Erick

On Tue, Jul 27, 2010 at 2:04 PM, Mark Holland <mark.holl...@zoopla.co.uk> wrote:

> Hi,
>
> I found the suggestions returned from the standard Solr spellcheck not to be
> that relevant. By contrast, aspell, given the same dictionary and misspelled
> words, gives much more accurate suggestions.
>
> I therefore wrote an implementation of SolrSpellChecker that wraps Jazzy,
> the Java aspell library. I also extended the SpellCheckComponent to take the
> matrix of suggested words and query the corpus to find the first combination
> of suggestions which returned a match. This works well for my use case,
> where term frequency is irrelevant to spelling or scoring.
>
> I'd like to publish the code in case someone finds it useful (although it's
> a bit crude at the moment and will need a decent tidy up). Would it be
> appropriate to open up a Jira issue for this?
>
> Cheers,
> ~mark
>
> On 27 July 2010 09:33, dan sutton <danbsut...@gmail.com> wrote:
>
> > Hi,
> >
> > I've recently been looking into spellchecking in Solr, and was struck by
> > how limited the usefulness of the tool was.
> >
> > Like most corpora, ours contains lots of different spelling mistakes for
> > the same word, so 'spellcheck.onlyMorePopular' is not really that useful
> > unless you click on it numerous times.
> >
> > I was thinking: since people spell words correctly most of the time, why
> > is there no other frequency parameter that could enter into the score?
> > i.e. something like:
> >
> > spell_score ~ edit_dist * freq
> >
> > I'm sure others have come across this issue, and I was wondering what
> > steps/algorithms they have used to overcome these limitations?
> >
> > Cheers,
> > Dan
> >
>
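
For reference, Dan's "spell_score ~ edit_dist * freq" idea can be sketched
in a few lines. This is only an illustration, not Solr code. It assumes
that "edit_dist" is meant as a similarity (1 minus the normalised
Levenshtein distance), since multiplying by the raw distance would reward
worse matches, and it damps the frequency with a logarithm so that very
common terms do not swamp everything else. Both choices, and the names
spell_score and rank, are assumptions made for this sketch.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def spell_score(query: str, candidate: str, freq: int) -> float:
    """Similarity times damped frequency (both are assumptions, see above)."""
    longest = max(len(query), len(candidate), 1)
    similarity = 1.0 - levenshtein(query, candidate) / longest
    return similarity * math.log1p(freq)


def rank(query: str, term_freqs: dict) -> list:
    """Return the candidate terms ordered from best to worst score."""
    return sorted(term_freqs,
                  key=lambda term: spell_score(query, term, term_freqs[term]),
                  reverse=True)
```

Given an index where the misspelling "recieve" occurs a handful of times
and "receive" occurs hundreds of times, rank("recieve", ...) puts "receive"
first even though the misspelling itself is an exact match.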

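The approach Mark describes, walking the matrix of per-term suggestions and
returning the first combination that matches something in the corpus, might
look roughly like the sketch below. His code was not posted, so this is a
guess at the shape of the idea only. The function name and the has_match
callback are hypothetical; in a real setup has_match would issue a query
against the index, which is not shown here.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple


def first_matching_combination(
    suggestions: Sequence[Sequence[str]],
    has_match: Callable[[Tuple[str, ...]], bool],
) -> Optional[Tuple[str, ...]]:
    """Try one suggestion per query term until a combination matches.

    suggestions[i] holds the candidate spellings for the i-th query term,
    best candidate first. has_match is a hypothetical callback that reports
    whether the corpus contains a document matching all the given terms.
    """
    # product() varies the last term fastest, so combinations are tried in
    # lexicographic order of suggestion rank, not by any combined score.
    for combo in product(*suggestions):
        if has_match(combo):
            return combo
    return None
```

The number of combinations grows multiplicatively with the number of terms,
so capping the suggestions per term keeps the number of corpus queries
manageable.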