Re: [CLucene-dev] Inquiry about CLucene's UTF-8 support

2023-07-14 Thread Kostka Bořivoj
, July 14, 2023 8:27 AM
To: clucene-developers@lists.sourceforge.net
Subject: Re: [CLucene-dev] Inquiry about CLucene's UTF-8 support

Hi Developers,

I am attaching the tokens generated from Java Lucene and CLucene. I am getting different tokens for non-Latin texts using StandardAnalyzer

Re: [CLucene-dev] Inquiry about CLucene's UTF-8 support

2023-07-14 Thread Achyuth Pramod
/wiki/Sigma)
>
> Hope this helps
>
> Borivoj
>
> *From:* Achyuth Pramod [mailto:achyuthpra...@gmail.com]
> *Sent:* Monday, July 10, 2023 2:32 PM
> *To:* clucene-developers@lists.sourceforge.net
> *Subject:* [CLucene-dev] Inquiry about CLuce

Re: [CLucene-dev] Inquiry about CLucene's UTF-8 support

2023-07-10 Thread Kostka Bořivoj
, 2023 2:32 PM
To: clucene-developers@lists.sourceforge.net
Subject: [CLucene-dev] Inquiry about CLucene's UTF-8 support

Dear developers,

I am using CLucene in my project and I would like to inquire about the UTF-8 encoding support in the Standard Analyzer. Specifically, I would like to know

[CLucene-dev] Inquiry about CLucene's UTF-8 support

2023-07-10 Thread Achyuth Pramod
Dear developers, I am using CLucene in my project and I would like to inquire about the UTF-8 encoding support in the Standard Analyzer. Specifically, I would like to know if the Standard Analyzer handles tokenization and text processing correctly for non-Latin UTF-8 encoded text. Could you