Hi Breck,

Thanks for your answer.
>>
>> I am not really satisfied with Lucene's spellcheck contribution because
>> the index contains some (many?) misspelled words, so the "did you mean"
>> class (from the java.net example) mostly finds similar misspelled words.
>> With the similarWords function the correct word is only around position
>> 2-5, though it should be more frequent in the index.
> 
> Not quite sure I understand what the issue is here. Is it that the
> similarWords returns ranked words and the correct one is too far down
> the ranked list?

Yes, that is exactly the problem. It is even worse when searching with
multiple words, because the corrected query often has no results.
Another part of the problem is that there are some (many?) typos in the
search index.
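
To make the two points above concrete, here is a minimal sketch of what
I have in mind. It does not use Lucene's or LingPipe's API: the names
doc_freq, rerank and likely_typos are hypothetical, and the frequencies
are made-up illustration values. The assumption is that a term-to-
document-frequency map can be exported from the index. Candidates close
to the query word are ranked by frequency, and rare terms that sit next
to a much more frequent term are flagged as probable typos.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def rerank(word, doc_freq, max_dist=2):
    """Candidates within max_dist edits of word, most frequent first."""
    cands = []
    for term, df in doc_freq.items():
        dist = edit_distance(word, term)
        if term != word and dist <= max_dist:
            cands.append((-df, dist, term))
    return [term for _, _, term in sorted(cands)]


def likely_typos(doc_freq, max_dist=2, ratio=50):
    """Map each rare term to a near neighbour that is at least `ratio`
    times more frequent. Quadratic in the number of terms, so this is
    only meant for illustration or for a small vocabulary."""
    flagged = {}
    for term, df in doc_freq.items():
        best = None
        for other, odf in doc_freq.items():
            if other == term or odf < ratio * df:
                continue
            if edit_distance(term, other) <= max_dist:
                if best is None or odf > doc_freq[best]:
                    best = other
        if best is not None:
            flagged[term] = best
    return flagged


# Made-up document frequencies, for illustration only.
doc_freq = {"heidelberg": 950, "heidelburg": 3,
            "heidelbreg": 1, "heidelberger": 40}

suggestions = rerank("heidelbrg", doc_freq)
typos = likely_typos(doc_freq)
```

With these sample values the frequent spelling comes first in
suggestions, and the two rare variants are flagged in typos. The
thresholds max_dist and ratio are guesses and would need tuning against
the real index.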

>> What about performance?
> 
> Tuning params dominate the performance space. A small beam (16 active
> hypotheses) will be quite snappy (I have 200 queries/sec with a beam of
> 32 over an 80 GB text collection that, with some pruning, was 5 GB in
> memory, running an 8-gram model).
> 

That's really impressive (though I didn't understand what you mean by
"beams").

Did I understand the license terms correctly, that I could use LingPipe
for free when building a search engine for an academic website (which
is free to use)?

thanks,
martin

> Tuning is a big deal and I need to write a tuning tutorial. I am doing
> more teaching/training now so that may happen.
> 
> 
> breck
> 
>>
>>
>> Does anybody have a good idea how to find typos in the index?
>>
>> tia,
>> martin
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 


-- 
Universitaetsbibliothek Heidelberg   Tel: +49 6221 54-2580
Ploeck 107-109, D-69117 Heidelberg   Fax: +49 6221 54-2623
