Hi Breck,

thanks for your answer.

>> With Lucene's spellcheck contribution I am not really satisfied, because
>> the index has some (many?) misspelled words, so the "did you mean" class
>> (from the java.net example) is good at finding similar misspelled words.
>> With the similarWords function the correct word is only around position
>> 2-5, though it should be more frequent in the index.
>
> Not quite sure I understand what the issue is here. Is it that
> similarWords returns ranked words and the correct one is too far down
> the ranked list?
Yes, that is exactly the problem. The problem is even worse when searching
with multiple words, because the corrected query often has no results.
Another part of the problem is that there are some (many?) typos in the
search_index.

>> What about performance?
>
> Tuning params dominate the performance space. A small beam (16 active
> hypotheses) will be quite snappy (I have 200 queries/sec with a 32 beam
> over an 80 gig text collection that with some pruning was 5 gig in memory
> running an 8 gram model)

That's really impressive (though I didn't understand what you mean by
"beams").

Did I understand the license terms correctly, that I could use LingPipe for
free when I am building a search engine for an academic website (for free
use)?

thanks,
martin

> Tuning is a big deal and I need to write a tuning tutorial. I am doing
> more teaching/training now so that may happen.
>
> breck
>
>> Does anybody have a good idea how to find typos in the index?
>>
>> tia,
>> martin

--
Universitaetsbibliothek Heidelberg    Tel: +49 6221 54-2580
Ploeck 107-109, D-69117 Heidelberg    Fax: +49 6221 54-2623

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]