Re: hello, a question about solr.
A tiny explanation can be found here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

2008/8/18 finy finy [EMAIL PROTECTED]:
Thanks for your help. Could you give me your Gmail Talk address or MSN?

2008/8/19, Norberto Meijome [EMAIL PROTECTED]:
On Mon, 18 Aug 2008 23:07:19 +0800 finy finy [EMAIL PROTECTED] wrote:
Because I use Chinese characters. For example, Solr parses ibm笔记本电脑 into a term ibm and a phrase 笔记本 电脑. Can I use Solr to query with a term ibm and a term 笔记本 and a term 电脑?

Hi finy, you should look into n-gram tokenizers. Not sure if it is documented in the wiki, but it has been discussed on the mailing list quite a few times. In short, an n-gram tokenizer breaks your input into blocks of n characters, which are then compared against the index. I think for Chinese, bi-gram is the favoured approach.

good luck,
B
{Beto|Norberto|Numard} Meijome

I used to hate weddings; all the Grandmas would poke me and say, "You're next sonny!" They stopped doing that when I started to do it to them at funerals.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.

-- Alexander Ramos Jardim
Re: hello, a question about solr.
On Wed, 20 Aug 2008 10:58:50 -0300 Alexander Ramos Jardim [EMAIL PROTECTED] wrote:
A tiny explanation can be found here: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

Thanks Alexander. Indeed, it is quite short, and focused on shingles, which, if I understand correctly, are groups of n terms. The NGramTokenizer, by contrast, creates tokens of n characters from your input. Searching the archives for "ngram" or "n-gram" should bring up more relevant information, which isn't in the wiki yet.

B
{Beto|Norberto|Numard} Meijome

All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke
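To make the shingle versus n-gram distinction concrete, here is a minimal Python sketch. It is not how Solr or Lucene implement either filter; the function names word_shingles and char_ngram_tokens are made up for illustration, and splitting on whitespace is an assumption that only suits space-delimited text.

```python
def word_shingles(text, n=2):
    """Group consecutive whitespace-separated terms into shingles of n terms."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def char_ngram_tokens(text, n=2):
    """Slide a window of n characters over the raw input."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    # Shingles work on whole terms ...
    print(word_shingles("ibm t63 notebook"))
    # ... while character n-grams ignore term boundaries entirely.
    print(char_ngram_tokens("ibm"))
```

Shingles need the text to be split into terms first, so on their own they do not help with input that has no spaces between words; character n-grams do not have that requirement.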
Re: hello, a question about solr.
The name field is of type text, which is analysed. I use the query name:ibmT63notebook

2008/8/18, Shalin Shekhar Mangar [EMAIL PROTECTED]:
Hi,
What is the type of the field name? Does a query like name:ibm OR name:T63 OR name:notebook work for you?

On Mon, Aug 18, 2008 at 10:43 AM, finy finy [EMAIL PROTECTED] wrote:
I have been using Solr for 3 months and have a question. I checked the Solr source code and found that it uses Lucene's QueryParser to parse the user's input query string. For example, Solr parses a query like name:ibmT63notebook as 'name:ibm T63 notebook' and treats it as a PhraseQuery. But I want results that include ibm and T63 and notebook at any position. For example, it should match a sentence like "i have a notebook, it is t63 of ibm". Solr doesn't do that, because it treats the parsed query as a PhraseQuery. How can I get the behaviour I want?

Thanks, your friend!

-- Regards, Shalin Shekhar Mangar.
Re: hello, a question about solr.
On Mon, 18 Aug 2008 15:33:02 +0800 finy finy [EMAIL PROTECTED] wrote:
The name field is of type text, which is analysed. I use the query name:ibmT63notebook

Why do you search with no spaces? Is this free text entered by a user, or is it part of a link which you control?

PS: please don't top-post

{Beto|Norberto|Numard} Meijome

Commitment is active, not passive. Commitment is doing whatever you can to bring about the desired result. Anything less is half-hearted.
Re: hello, a question about solr.
Because I use Chinese characters. For example, Solr parses ibm笔记本电脑 into a term ibm and a phrase 笔记本 电脑. Can I use Solr to query with a term ibm and a term 笔记本 and a term 电脑?

2008/8/18, Norberto Meijome [EMAIL PROTECTED]:
On Mon, 18 Aug 2008 15:33:02 +0800 finy finy [EMAIL PROTECTED] wrote:
The name field is of type text, which is analysed. I use the query name:ibmT63notebook

Why do you search with no spaces? Is this free text entered by a user, or is it part of a link which you control?

PS: please don't top-post

{Beto|Norberto|Numard} Meijome

Commitment is active, not passive. Commitment is doing whatever you can to bring about the desired result. Anything less is half-hearted.
Re: hello, a question about solr.
On Mon, 18 Aug 2008 23:07:19 +0800 finy finy [EMAIL PROTECTED] wrote:
Because I use Chinese characters. For example, Solr parses ibm笔记本电脑 into a term ibm and a phrase 笔记本 电脑. Can I use Solr to query with a term ibm and a term 笔记本 and a term 电脑?

Hi finy, you should look into n-gram tokenizers. Not sure if it is documented in the wiki, but it has been discussed on the mailing list quite a few times. In short, an n-gram tokenizer breaks your input into blocks of n characters, which are then compared against the index. I think for Chinese, bi-gram is the favoured approach.

good luck,
B
{Beto|Norberto|Numard} Meijome

I used to hate weddings; all the Grandmas would poke me and say, "You're next sonny!" They stopped doing that when I started to do it to them at funerals.
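As a rough illustration of the bi-gram idea, the following Python sketch breaks a Chinese string into overlapping two-character blocks and checks whether a query's blocks all occur among the indexed ones. It is only a toy model under stated assumptions: the names char_ngrams and matches are hypothetical, the "index" is a plain set, and real Solr analysis also tracks positions, fields and scoring.

```python
def char_ngrams(text, n=2):
    """Break text into overlapping blocks of n characters."""
    if len(text) < n:
        # Assumption: input shorter than n is kept as a single token.
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def matches(query, indexed_grams, n=2):
    """True if every n-gram of the query is present in the indexed set."""
    grams = char_ngrams(query, n)
    return bool(grams) and all(g in indexed_grams for g in grams)


if __name__ == "__main__":
    # Index the Chinese part of the example from this thread as bigrams.
    indexed = set(char_ngrams("笔记本电脑"))
    print(sorted(indexed))
    print(matches("电脑", indexed))    # bigram 电脑 is in the index
    print(matches("笔记本", indexed))  # bigrams 笔记 and 记本 are in the index
    print(matches("电话", indexed))    # bigram 电话 is not
```

Because the blocks overlap, a query for either 笔记本 or 电脑 can match without the tokenizer knowing where the word boundary is.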
Re: hello, a question about solr.
Thanks for your help. Could you give me your Gmail Talk address or MSN?

2008/8/19, Norberto Meijome [EMAIL PROTECTED]:
On Mon, 18 Aug 2008 23:07:19 +0800 finy finy [EMAIL PROTECTED] wrote:
Because I use Chinese characters. For example, Solr parses ibm笔记本电脑 into a term ibm and a phrase 笔记本 电脑. Can I use Solr to query with a term ibm and a term 笔记本 and a term 电脑?

Hi finy, you should look into n-gram tokenizers. Not sure if it is documented in the wiki, but it has been discussed on the mailing list quite a few times. In short, an n-gram tokenizer breaks your input into blocks of n characters, which are then compared against the index. I think for Chinese, bi-gram is the favoured approach.

good luck,
B
{Beto|Norberto|Numard} Meijome

I used to hate weddings; all the Grandmas would poke me and say, "You're next sonny!" They stopped doing that when I started to do it to them at funerals.
hello, a question about solr.
I have been using Solr for 3 months and have a question. I checked the Solr source code and found that it uses Lucene's QueryParser to parse the user's input query string. For example, Solr parses a query like name:ibmT63notebook as 'name:ibm T63 notebook' and treats it as a PhraseQuery. But I want results that include ibm and T63 and notebook at any position. For example, it should match a sentence like "i have a notebook, it is t63 of ibm". Solr doesn't do that, because it treats the parsed query as a PhraseQuery. How can I get the behaviour I want?

Thanks, your friend!
Re: hello, a question about solr.
Hi,
What is the type of the field name? Does a query like name:ibm OR name:T63 OR name:notebook work for you?

On Mon, Aug 18, 2008 at 10:43 AM, finy finy [EMAIL PROTECTED] wrote:
I have been using Solr for 3 months and have a question. I checked the Solr source code and found that it uses Lucene's QueryParser to parse the user's input query string. For example, Solr parses a query like name:ibmT63notebook as 'name:ibm T63 notebook' and treats it as a PhraseQuery. But I want results that include ibm and T63 and notebook at any position. For example, it should match a sentence like "i have a notebook, it is t63 of ibm". Solr doesn't do that, because it treats the parsed query as a PhraseQuery. How can I get the behaviour I want?

Thanks, your friend!

-- Regards, Shalin Shekhar Mangar.
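If the client application already knows the individual terms, a query of the shape suggested above can be assembled before it is sent to Solr. The sketch below is only an illustration in Python: field_query is a hypothetical helper, not part of Solr or any client library, and it assumes the terms contain no characters that need escaping in the Lucene query syntax.

```python
def field_query(field, terms, operator="OR"):
    """Join per-term clauses on one field with a boolean operator.

    Assumption: terms are plain words with no query-syntax special
    characters, so no escaping is done here.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return f" {operator} ".join(f"{field}:{term}" for term in terms)


if __name__ == "__main__":
    terms = ["ibm", "T63", "notebook"]
    # Any of the terms, as in the suggestion above.
    print(field_query("name", terms))
    # All of the terms, in any position, as asked for in the original question.
    print(field_query("name", terms, operator="AND"))
```

Neither form is a phrase query, so the order and position of the terms in the document do not matter for matching.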