Dave Spencer wrote:
Otis Gospodnetic wrote:
Maybe the spellchecker at the bottom of the following URL will help:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/
Yeah, I did this, the "ngram based spelling corrector".
You build a normal Lucene index as you always do,
then run NGramSpeller, which analyzes your index to determine which ngrams are used and saves this in a separate Lucene index,
then you call NGramSpeller.suggestUsingNGrams() if a user's query returns few or no results.
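For readers who want a feel for the idea behind these steps, here is a minimal sketch in plain Java. It is not the NGramSpeller source and uses no Lucene classes: the class NGramSuggestSketch, its methods, and the in-memory map standing in for the separate ngram index are all hypothetical, and ranking by the number of shared ngrams is an assumed simplification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the actual NGramSpeller code.
class NGramSuggestSketch {
    // Stand-in for the separate ngram index: ngram -> terms containing it.
    private final Map<String, Set<String>> gramToTerms = new HashMap<>();
    private final int n;

    NGramSuggestSketch(int n) {
        this.n = n;
    }

    // Split a word into overlapping character ngrams of length n.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Record which ngrams a term from the main index uses.
    void addTerm(String term) {
        for (String gram : ngrams(term, n)) {
            gramToTerms.computeIfAbsent(gram, k -> new HashSet<>()).add(term);
        }
    }

    // Rank known terms by how many distinct ngrams they share with the query.
    List<String> suggest(String query, int max) {
        Map<String, Integer> shared = new HashMap<>();
        for (String gram : new HashSet<>(ngrams(query, n))) {
            Set<String> terms = gramToTerms.getOrDefault(gram, Collections.<String>emptySet());
            for (String term : terms) {
                shared.merge(term, 1, Integer::sum);
            }
        }
        shared.remove(query); // never suggest the query itself
        List<String> ranked = new ArrayList<>(shared.keySet());
        // Most shared ngrams first; ties broken alphabetically for stable output.
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(shared.get(b), shared.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(max, ranked.size())));
    }
}
```

With n = 3 and the terms "hashmap", "hashtable" and "treemap" added, the query "hashmep" shares three trigrams with "hashmap" (has, ash, shm) and two with "hashtable", so "hashmap" comes back first and "treemap" is not suggested at all.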
weblog entry here w/ more info and a test page:
http://www.searchmorph.com/weblog/index.php?id=23
Oh, and if not obvious from the above, the code is in use live. Searchmorph has a search engine of javadoc pages. Here I search for "hashmep" (intending "hashmap"):
http://www.searchmorph.com/kat/search.jsp?s=hashmep
See the suggestions after the text "I cannot find hashmep anywhere. Instead try these variations..." and note that it read my mind :) and hashmap is the first suggestion.
--
Some chance you'll be interested in the "more like this" similarity query generator - see the "similar" tree in the sandbox.
-- Dave
Otis
--- "Stefan F. Keller" <[EMAIL PROTECTED]> wrote:
We would like to add "Did you mean..." to our Lucene-based search engine www.geometa.info. Doug mentioned in his recent interview that this feature would not be too complicated to implement.
First I considered integrating a spelling checker (through JADT-API) but one would rather expect "nearby" words which really exist in the document pool. Some people have mentioned this feature here (or on the java-user-list).
=> Is anyone aware of any real developments in this area? Ideally, one would combine the data already maintained by the IndexReader class with an existing similarity search algorithm (like trigram)...
=> Any ideas?
Stefan
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
