On Mon, 18 Aug 2008 23:07:19 +0800 "finy finy" <[EMAIL PROTECTED]> wrote:
> because i use chinese character, for example "ibm_______________"
> solr will parse it into a term "ibm" and a phraze "_________ ______"
> can i use solr to query with a term "ibm" and a term "_________" and a term
> "______"?

Hi finy,

you should look into n-gram tokenizers. I'm not sure whether they are documented in the wiki, but they have been discussed on the mailing list quite a few times.

In short, an n-gram tokenizer breaks your input into overlapping blocks of n characters, and those blocks are the terms that get indexed and matched at query time. I think for Chinese, bi-gram (n = 2) is the favoured approach.

Good luck,
B

_________________________
{Beto|Norberto|Numard} Meijome
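
P.S. In case it helps to see what "blocks of characters of size n" means, here is a minimal sketch in plain Python. It is only an illustration of the idea, not Solr's own tokenizer code; the function name `char_ngrams` and the sample string are made up for this example.

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text, left to right.

    Hypothetical helper for illustration only; it is not part of Solr.
    """
    if not text:
        return []
    if len(text) < n:
        # Input shorter than n: keep it as a single block.
        return [text]
    # Slide a window of width n over the text, one character at a time.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Four Chinese characters become three overlapping bi-grams.
print(char_ngrams("搜索引擎"))  # ['搜索', '索引', '引擎']
```

Because the same splitting is applied to the query, a search for any two adjacent characters can match the indexed text without the engine having to know where the word boundaries are.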