Look at the CHANGES.txt document in CVS: there is some new stuff in
the org.apache.lucene.analysis.ru package that you will want to use.
Get Lucene from the nightly build...
Otis
--- Andrey Grishin <[EMAIL PROTECTED]> wrote:
> Hi All,
> I have a problem with searching Russian content using Lucene 1.2
>
> I indexed the content using the Cp1251 charset
> ------------
> text = new String(text.getBytes("Cp1251"));
> doc.add(Field.Text(CONTENT_FIELD,text));
>
> ------------
> and I am searching using the same charset
>
> String txt = "���";
> txt = new String(txt.getBytes("Cp1251"));
> PrefixQuery query = new PrefixQuery(new
> Term(PortalHTMLDocument.CONTENT_FIELD, txt));
> hits = searcher.search(query);
>
> or
>
> Analyzer analyzer = new StandardAnalyzer();
> String txt = "������";
> txt = new String(txt.getBytes("Cp1251"));
> Query query = QueryParser.parse(txt,
> PortalHTMLDocument.CONTENT_FIELD, analyzer);
>
> hits = searcher.search(query);
>
>
> and Lucene can't find anything.
> I also checked for a DecodeInterceptor in my server.xml; there
> isn't one.
>
> I tried UTF-8 and UTF-16 and got the same result.
>
> Also, if I list the index's contents by iterating over an
> IndexReader, I can see that my Russian content is stored in the
> index...
> Can you please help me? Do you have any more ideas about what else
> can be done here to fix this problem?
>
> I will appreciate any help.
> Thanks, Andrey.
>
> P.S.
> I am using Lucene 1.2, Tomcat 4.1.12, JDK 1.4.1 on Win2000 AS
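
One detail in the quoted code above may be worth checking separately
from the analyzer. A Java String is already Unicode, so
new String(text.getBytes("Cp1251")) encodes the text to Cp1251 bytes
and then decodes those bytes with the platform default charset. That
only gives back the original text if the default happens to be Cp1251.
Below is a minimal sketch of that round trip using only the JDK. The
class name CharsetRoundTrip, the roundTrip helper and the sample word
are mine, made up for illustration, and UTF-8 stands in for "a default
charset that is not Cp1251"; I have not seen your actual default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Lucene or of the original code.
public class CharsetRoundTrip {

    // Mimics new String(text.getBytes("Cp1251")), but with the decoding
    // charset passed explicitly instead of taken from the platform default.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "windows-1251" is the canonical name for the Cp1251 alias.
        Charset cp1251 = Charset.forName("windows-1251");

        // Sample Russian word written with Unicode escapes so that the
        // source file encoding does not matter (an assumed example).
        String word = "\u043f\u043e\u0438\u0441\u043a";

        // Same charset both ways: the text survives unchanged.
        String same = roundTrip(word, cp1251, cp1251);

        // Decoding Cp1251 bytes with a different charset corrupts the text.
        String broken = roundTrip(word, cp1251, StandardCharsets.UTF_8);

        System.out.println("same charset keeps text:  " + word.equals(same));
        System.out.println("mixed charsets keep text: " + word.equals(broken));
    }
}
```

If the second comparison fails on your machine too, it may be simpler
to drop the getBytes() round trip entirely and instead make sure the
text is decoded with the right charset once, at the point where it is
read from the file or the HTTP request.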