Dave Spencer wrote:

Otis Gospodnetic wrote:

Maybe the spellchecker at the bottom of the following URL will help:

http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/



Yeah, I did this, the "ngram based spelling corrector".

You build a normal Lucene index as you always do,
then run NGramSpeller, which analyzes your index to determine which ngrams are used and saves this in a separate Lucene index,
then you call NGramSpeller.suggestUsingNGrams() if a user's query doesn't return many results.
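
To make the idea concrete, here is a rough sketch of the ngram approach in plain Java. It is not the NGramSpeller code from the sandbox: the class NGramSuggestSketch and its methods are hypothetical names made up for illustration, an in-memory map stands in for the separate Lucene index, and ranking by the count of shared ngrams is an assumed (simplistic) scoring rule.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not the sandbox NGramSpeller implementation. */
public class NGramSuggestSketch {

    /** Splits a word into its overlapping n-grams, e.g. "hash" gives "has" and "ash" for n = 3. */
    static Set<String> ngrams(String word, int n) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    /** Maps each n-gram to the vocabulary words that contain it (stand-in for the separate index). */
    static Map<String, Set<String>> buildNGramIndex(Iterable<String> vocabulary, int n) {
        Map<String, Set<String>> index = new HashMap<>();
        for (String word : vocabulary) {
            for (String gram : ngrams(word, n)) {
                index.computeIfAbsent(gram, g -> new HashSet<>()).add(word);
            }
        }
        return index;
    }

    /** Returns up to max vocabulary words, ordered by how many n-grams they share with the query. */
    static List<String> suggest(String query, Map<String, Set<String>> index, int n, int max) {
        Map<String, Integer> shared = new HashMap<>();
        for (String gram : ngrams(query, n)) {
            for (String word : index.getOrDefault(gram, Collections.<String>emptySet())) {
                shared.merge(word, 1, Integer::sum);
            }
        }
        List<String> ranked = new ArrayList<>(shared.keySet());
        ranked.remove(query); // never suggest the query itself
        ranked.sort((a, b) -> {
            int byCount = shared.get(b) - shared.get(a); // more shared n-grams first
            return byCount != 0 ? byCount : a.compareTo(b); // ties broken alphabetically
        });
        return new ArrayList<>(ranked.subList(0, Math.min(max, ranked.size())));
    }
}
```

With a vocabulary of hashmap, hashtable, hashset and treemap and n = 3, suggest("hashmep", index, 3, 3) puts hashmap first, since it shares three trigrams (has, ash, shm) with the misspelling while hashset and hashtable share only two.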


weblog entry here w/ more info and a test page:

http://www.searchmorph.com/weblog/index.php?id=23


Oh and if not obvious from the above, the code is in use live.
searchmorph has a search engine for javadoc pages.
Here I search for "hashmep" (intending 'hashmap')

http://www.searchmorph.com/kat/search.jsp?s=hashmep

See the suggestions after the text "I cannot find hashmep anywhere. Instead try these variations..." and note that it read my mind :) and hashmap is the first suggestion.



--

Some chance you'll be interested in the "more like this" similarity query generator - see the "similar" tree in the sandbox

-- Dave

Otis


--- "Stefan F. Keller" <[EMAIL PROTECTED]> wrote:


We would like to add "Did you mean..." to our Lucene-based search
engine www.geometa.info. Doug mentioned in his recent interview that
this feature would not be too complicated to implement.

First I considered integrating a spelling checker (through JADT-API)
but one would rather expect "nearby" words which really exist in the
document pool. Some people have mentioned this feature here (or on
the java-user-list).

=> Is anyone aware of any real developments in this area?
Ideally, one would combine the data already maintained by the
IndexReader class with an existing similarity search algorithm (like
trigram)...

=> Any ideas?

Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



