Re: different analyzer all produce the same index?

2004-10-04 Thread sergiu gordea
Daan Hoogland wrote:
Hi all,
I try to create different indices using different Analyzer-classes. I 
tried standard, german, russian, and cjk. They all produce exactly the 
same index file (md5-wise). There are over 280 pages so I expected at 
least some differences.

 

Take a look at the Lucene source code... maybe you will find the answer there.
I assume that all the pages you indexed were written in English, so it is
normal for the German, Russian and CJK analyzers to create identical indexes,
but they should differ from the English one (StandardAnalyzer).

All the best,
Sergiu
Any ideas anyone?
 


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: different analyzer all produce the same index?

2004-10-04 Thread Morus Walter
sergiu gordea writes:
 Daan Hoogland wrote:
 
 Hi all,
 
 I try to create different indices using different Analyzer-classes. I 
 tried standard, german, russian, and cjk. They all produce exactly the 
 same index file (md5-wise). There are over 280 pages so I expected at 
 least some differences.
 
   
 
 Take a look in the lucene source code... Maybe you will find the answer ...
 I asume that all the pages you indexed were written in English, 
 therefore is normal that german, russian and cjk analyzers to
 create identic indexex, but htey should be different  than english one 
 (StandardAnalyzer)
 
The German analyzer definitely won't leave English text as it is, since it
does algorithmic stemming.
E.g. your text becomes
tak a look in the luc sourc cod mayb you will find the answ i asum tha all the pag you 
indexed wer writt in english therefor is normal tha germa russia and cjk analyx to 
crea identic indexex but htey should be diff tha english one standardanalyx
while StandardAnalyzer does not stem at all and gives
take a look in the lucene source code maybe you will find the answer i asume that all 
the pages you indexed were written in english therefore is normal that german russian 
and cjk analyzers to create identic indexex but htey should be different than english 
one standardanalyzer

I'd rather suspect a problem with the indexing code.
So my advice is to check what the analyzer actually produces.
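
A minimal sketch of such a check, in plain Java with no Lucene dependency. The class AnalyzerCheck and its two stand-in analyzers are hypothetical names, not Lucene classes, and the suffix stripper is a toy, not the German stemmer's actual algorithm, so its output only roughly resembles the stemmed text above. With real Lucene the terms would instead be read from the TokenStream returned by Analyzer.tokenStream(fieldName, reader). The point is only that a stemming and a non-stemming analyzer should give different terms, and therefore different checksums, for the same English text.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; none of this is Lucene API.
final class AnalyzerCheck {

    /** Stand-in for a non-stemming analyzer: lowercase, split on non-letters. */
    static List<String> plainTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Stand-in for a stemming analyzer: same tokens, then toy suffix stripping. */
    static List<String> stemmedTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : plainTerms(text)) {
            terms.add(stripSuffix(t));
        }
        return terms;
    }

    /** Removes at most one common suffix, keeping at least three characters. */
    static String stripSuffix(String term) {
        for (String suffix : new String[] {"en", "er", "es", "e", "s"}) {
            if (term.length() > suffix.length() + 2 && term.endsWith(suffix)) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    /** MD5 of the space-joined terms, as lowercase hex. */
    static String md5Hex(List<String> terms) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(String.join(" ", terms).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "Take a look in the lucene source code";
        List<String> plain = plainTerms(text);
        List<String> stemmed = stemmedTerms(text);
        // Print each term list next to its checksum so differences are visible.
        System.out.println(plain + " " + md5Hex(plain));
        System.out.println(stemmed + " " + md5Hex(stemmed));
    }
}
```

If the term lists printed for two real analyzers differ but the index files are still identical, that would suggest the indexing code is not using the analyzer it was handed.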

Morus
