Hi,

In my experience, the best way is to create n-grams for Asian texts. I
think the basic CJKAnalyzer already does it this way.
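
As a rough illustration of the n-gram idea, here is a minimal sketch of
overlapping bigram tokenization. It is not CLucene's CJKAnalyzer code; the
function name cjkBigrams is hypothetical, and it assumes the input is already
a run of CJK characters decoded to UTF-32 code points (no Latin text,
punctuation, or whitespace handling).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only.
// Splits a run of CJK code points into overlapping two-character tokens,
// so no dictionary or word-boundary detection is needed.
std::vector<std::u32string> cjkBigrams(const std::u32string& text) {
    std::vector<std::u32string> tokens;

    // A single isolated character is kept as a one-character token.
    if (text.size() == 1) {
        tokens.push_back(text);
        return tokens;
    }

    // Slide a window of width 2 over the text, advancing one character
    // at a time. An empty input produces no tokens.
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        tokens.push_back(text.substr(i, 2));
    }
    return tokens;
}
```

With this sketch, a four-character input yields three tokens: characters
1-2, 2-3, and 3-4. The same splitting would have to be applied to the query
text so that query bigrams match the indexed ones.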

Manuel

From: Vitaly Artemov [mailto:vitalyarte...@gmail.com]
Sent: Thursday, 22 November 2012 11:23
To: clucene-developers@lists.sourceforge.net
Subject: Re: [CLucene-dev] Creating CLucene Index in a Database; Support for
Asian languages

One more question about Asian languages:
I know that word boundaries are a difficult issue in Asian languages.
How do you tokenize Asian texts?
Thank you, Vitaly
On Thu, Nov 22, 2012 at 12:18 PM, Vitaly Artemov
<vitalyarte...@gmail.com> wrote:
Thank you for your fast reply.
Can you please explain why a filesystem store is better than a database?
We will use CLucene to index and search a huge amount of data.
Vitaly

On Thu, Nov 22, 2012 at 11:40 AM, Itamar Syn-Hershko
<ita...@code972.com> wrote:
inline

On Thu, Nov 22, 2012 at 11:15 AM, Vitaly Artemov
<vitalyarte...@gmail.com> wrote:

Hello all,
I am starting to evaluate the CLucene engine for use in our product.
I have 2 questions.

1. Is it planned to add support (or does it already exist) for creating the
index in a database instead of in memory or on the filesystem?
    I read that Java Lucene has this by providing a JdbcDirectory interface.

Don't do that. Use the filesystem; it is much better in every respect.


2. I read in the FAQ that:
 "CLucene is not limited to English, nor any other language. To index text
properly, you need to use an Analyzer appropriate for the language of the
text you are indexing. CLucene's default Analyzers work well for English.
There are a number of other Analyzers in "CLucene Sandbox", including
those for Chinese, Japanese, and Korean."
   But the "CLucene Sandbox" link does not work for some reason. Can you
provide a link to the list of Analyzers?

Take a look at CJKAnalyzer


Thanks in advance, Vitaly

------------------------------------------------------------------------------
Monitor your physical, virtual and cloud infrastructure from a single
web console. Get in-depth insight into apps, servers, databases, vmware,
SAP, cloud infrastructure, etc. Download 30-day Free Trial.
Pricing starts from $795 for 25 servers or applications!
http://p.sf.net/sfu/zoho_dev2dev_nov
_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers


