Thanks all.  I've got it working now using a KeywordAnalyzer.  The edit
distance metric I'm using is purely edit-based: when I input
"Bennett", I get "Jennett", "Gannett", "Kenneth" and only then "Bennet".
While I see the logic, it's obviously not the best metric.  Is there an
appropriate edit distance metric that takes phonetics into account?
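
To make the problem concrete, here is a minimal sketch of plain Levenshtein distance applied to the names above. It is not Lucene's own implementation; the class name EditDistanceSketch and the method levenshtein are hypothetical, chosen only for illustration.

```java
// Hypothetical helper, not part of Lucene: plain Levenshtein distance.
class EditDistanceSketch {

    // Minimum number of single-character insertions, deletions, and
    // substitutions needed to turn a into b (case-sensitive).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String query = "Bennett";
        String[] candidates = {"Jennett", "Gannett", "Kenneth", "Bennet"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + levenshtein(query, candidate));
        }
    }
}
```

Under this metric "Jennett" and "Bennet" are both one edit away from "Bennett", while "Gannett" and "Kenneth" are two edits away, so edit distance alone has no reason to prefer the phonetically identical "Bennet".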

-----Original Message-----
From: Karl Wettin [mailto:[EMAIL PROTECTED] 
Sent: Thursday, January 31, 2008 6:12 AM
To: java-user@lucene.apache.org
Subject: Re: Spell checking street names


On 30 Jan 2008 at 17:34, Max Metral wrote:

> Part of the reason is if we look at some common mistakes:
>
>
> For Commonwealth:
>
> Communwealth
>
> Comonwealth
>
> Common wealth

If they are common mistakes, you can pick them up using reinforcement
learning.

<http://issues.apache.org/jira/browse/LUCENE-626> might help you.



    karl

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


