RE: bi-grams for common terms - any analyzers do that?

Andy Fri, 24 Sep 2010 22:05:28 -0700

--- On Thu, 9/23/10, Burton-West, Tom <tburt...@umich.edu> wrote:

> It also splits on whitespace which causes all CJK queries
> to be treated as phrase queries regardless of the CJK
> tokenizer you use.


But I thought specialized analyzers like CJKAnalyzer are designed for those 
languages, which don't use whitespace to separate words. 

Isn't it up to the tokenizer, not the QueryParser, to decide how to split the 
query into tokens?

I'm really confused.

If Solr's QueryParser will only split on whitespace no matter what, then what is 
the point of using CJKAnalyzer?
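
To make the question concrete, here is a toy model of the behaviour described in the quoted text. It is plain Python, not Lucene or Solr code: `cjk_bigrams` and `parse_query` are made-up names, and the bigram splitter is only a rough stand-in for what a CJK tokenizer does. The assumption is that the parser splits on whitespace first, hands each chunk to the analyzer, and turns any chunk that yields more than one token into a phrase query.

```python
def cjk_bigrams(text):
    """Toy stand-in for a CJK bigram tokenizer: overlapping character pairs."""
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def parse_query(query):
    """Toy model of the described parser (hypothetical, not Solr's code).

    Step 1: split the raw query on whitespace.
    Step 2: run the analyzer on each chunk separately.
    Step 3: a chunk that produces several tokens becomes a phrase clause.
    """
    clauses = []
    for chunk in query.split():
        tokens = cjk_bigrams(chunk)
        kind = "phrase" if len(tokens) > 1 else "term"
        clauses.append((kind, tokens))
    return clauses


if __name__ == "__main__":
    # No whitespace in the input, so the whole string is one chunk.
    print(parse_query("中华人民"))
    # Whitespace-separated single characters stay as separate terms.
    print(parse_query("a b"))
```

If this model is right, the tokenizer still decides how the text is cut into tokens, but a run of CJK text without spaces always reaches it as a single chunk and so always ends up as one phrase.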

It sounds like Solr would be pretty useless for CJK languages. Is there 
any workaround for this? Are any CJK sites using Solr?
