On Apr 20, 2005, at 1:11 PM, Doug Cutting wrote:
Byron Miller wrote:
At a quick glance, is this using an existing index to build the ngrams from or is this an index from a dictionary source?
I think it uses your existing index, to create a domain-specific corrector.
It can use an existing index to pull the terms from a specific field, or it can be handed a text file to process. Or you could create your own Dictionary implementation that pulls them from elsewhere.
I've recently implemented, but not yet deployed, the spell checker for the lucenebook.com site. The issue I had to deal with is that we use stemming on the main index, and generating ngrams from stemmed terms is not desirable, since the corrector would then suggest stems rather than whole words. So I ended up writing two indexes, one of them to RAM without stemming, and then using that RAM index to generate the spell checker index.
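To illustrate why unstemmed terms matter here, below is a minimal, self-contained sketch of character ngram overlap. It is not the spell checker's actual code: the class `NGramSketch` and its methods `ngrams` and `sharedGrams` are hypothetical names made up for this example, and the real implementation may weight or score ngrams differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene.
class NGramSketch {

    // Returns every contiguous substring of length n, in order of appearance.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Counts the distinct n-grams that the two words have in common.
    static int sharedGrams(String a, String b, int n) {
        Set<String> shared = new HashSet<>(ngrams(a, n));
        shared.retainAll(new HashSet<>(ngrams(b, n)));
        return shared.size();
    }
}
```

With trigrams, the misspelling "stemmming" shares six grams with the whole word "stemming" but only two with the stem "stem", so a dictionary built from stemmed terms gives the corrector much less to match against, and whatever it does match is a stem rather than a word a user would type.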
Erik
