Re: hello, a question about solr.

Norberto Meijome Mon, 18 Aug 2008 18:12:38 -0700

On Mon, 18 Aug 2008 23:07:19 +0800
"finy finy" <[EMAIL PROTECTED]> wrote:


> Because I use Chinese characters, for example "ibm_______________",
> Solr will parse it into a term "ibm" and a phrase "_________ ______".
> Can I use Solr to query with a term "ibm", a term "_________", and a term
> "______"?

Hi finy,
you should look into n-gram tokenizers. I am not sure whether they are
documented in the wiki, but they have been discussed on the mailing list quite
a few times.

In short, an n-gram tokenizer breaks your input into overlapping blocks of n
characters, which are then matched against the index. I think for Chinese,
bi-grams (n = 2) are the favoured approach.
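
To make the bi-gram idea concrete, here is a minimal sketch in plain Python. It is not Solr's tokenizer and does not use any Solr API; the function name char_ngrams and the sample strings are made up for illustration, and the Chinese sample is not the text from the original question.

```python
def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if len(text) < n:
        # Input shorter than n: keep it as a single token, or nothing if empty.
        return [text] if text else []
    # Slide a window of width n across the text, one character at a time.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    # Illustrative inputs only.
    print(char_ngrams("abcd"))      # ['ab', 'bc', 'cd']
    print(char_ngrams("中文分词"))  # ['中文', '文分', '分词']
```

Because every adjacent pair of characters becomes its own term, a query that is tokenized the same way can match a sub-sequence of the indexed text without needing whitespace between words.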

good luck,
B
_________________________
{Beto|Norberto|Numard} Meijome

