Analyzers, perfect hash, ICU

Karel Tejnora Wed, 11 Jan 2006 08:51:53 -0800

Hi all,

I'm working on an analyzer for the Slavic languages written in Latin script (cs, sk), without stemming at first.

I would like to ask you:

The StopWord analyzers often use a HashSet implementation, but the stop words rarely (if ever) change from the ones shipped in the Java code. Do you think there would be any benefit in using a perfect hash algorithm?

I will also write an ICU analyzer for Latin characters (decomposing each character and returning its base character). Do you have any experience with ICU (icu.sf.net)? Any problems or bottlenecks?
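To make the "decompose and return the base character" idea concrete, here is a minimal sketch. It is only an illustration under assumptions: it uses the JDK's java.text.Normalizer as a stand-in for ICU, and the class name BaseCharFolder and the method fold are hypothetical names, not part of Lucene or ICU.

```java
import java.text.Normalizer;

// Hypothetical helper: folds accented Latin letters to their base characters.
// Assumption: java.text.Normalizer is used here in place of ICU.
class BaseCharFolder {

    // Decompose to NFD, then drop the non-spacing combining marks.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); ) {
            int cp = decomposed.codePointAt(i);
            // Keep everything except combining marks such as caron, acute, ring.
            if (Character.getType(cp) != Character.NON_SPACING_MARK) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escapes spell a Czech word with r-caron, i-acute and s-caron.
        System.out.println(fold("P\u0159\u00edli\u0161")); // prints "Prilis"
    }
}
```

In an analyzer, such a fold would presumably be applied to each token before the stop word lookup, so that the stop list only needs to contain the folded forms.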


Thx,
Karel

P.S.: I would also like to contribute this work to lucene-contrib if it is considered useful. Is there a how-to for setting up Eclipse for Lucene/Apache-related projects?


