Indexing and Searching Chinese with SolrNet
Dear all,

After reading some pages on the Web, I created the index with the following schema.

    ..
    <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.ChineseTokenizerFactory"/>
      </analyzer>
    </fieldtype>
    ..

It must be correct, right? However, when sending a query through SolrNet, no results are returned. Could you tell me what the reason is?

Thanks,
LB
Re: Indexing and Searching Chinese with SolrNet
Why create two threads for the same problem?

Anyway, is your servlet container capable of accepting UTF-8 in the URL? Also, is SolrNet capable of handling those characters? To confirm, try a tool like curl.
Re: Indexing and Searching Chinese with SolrNet
Bing Li,

Go to your Solr Admin page and use the Analysis functionality there to enter some Chinese text and see how it's getting analyzed at index and at search time. This will tell you what is (or isn't) going on.

Here it looks like you just defined index-time analysis, so you should see your index-time analysis look very different from your query-time analysis.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
From: Bing Li lbl...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tue, January 18, 2011 1:30:37 PM
Subject: Indexing and Searching Chinese with SolrNet

> However, when sending a query through SolrNet, no results are returned.
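One way to rule out an index/query mismatch of the kind described above is to declare the analyzer without a type attribute, which Solr applies at both index and query time. The fragment below is only a sketch: reusing the same ChineseTokenizerFactory for queries is an assumption, not something the original poster confirmed.

```xml
<!-- Sketch only: same field type as in the original post, but the analyzer
     has no type attribute, so it is used for both indexing and querying.
     Using the same tokenizer at query time is an assumption. -->
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.ChineseTokenizerFactory"/>
  </analyzer>
</fieldtype>
```

With a definition like this, the Analysis page should show the same token stream for a given input on the index side and on the query side. Documents indexed under the old definition would need to be reindexed if the analysis actually changes.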
Re: Indexing and Searching Chinese with SolrNet
Dear Jelsma,

My servlet container is Tomcat 7. I think it should accept Chinese characters, but I am not sure how to configure it. In the Tomcat console, the Chinese characters in the query are not displayed correctly. However, they look fine in the Solr Admin page.

I am also not sure whether SolrNet supports Chinese. If not, how can I interact with Solr on .NET?

Thanks so much!
LB

On Wed, Jan 19, 2011 at 2:34 AM, Markus Jelsma markus.jel...@openindex.io wrote:

> Anyway, is your servlet container capable of accepting UTF-8 in the URL? Also, is SolrNet capable of handling those characters? To confirm, try a tool like curl.
Re: Indexing and Searching Chinese with SolrNet
Hi,

Yes, but Tomcat might need to be configured to accept UTF-8 in the URL. See the wiki for more information on this subject:

http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config

Cheers,
Re: Indexing and Searching Chinese with SolrNet
Dear Jelsma,

After configuring the Tomcat URIEncoding, Chinese characters are processed correctly. I really appreciate your help!

Best,
LB

On Wed, Jan 19, 2011 at 3:02 AM, Markus Jelsma markus.jel...@openindex.io wrote:

> Yes but Tomcat might need to be configured to accept, see the wiki for more information on this subject.
> http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
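For reference, the URIEncoding setting mentioned above is an attribute on the HTTP connector in Tomcat's conf/server.xml. The fragment below is a minimal sketch: the port, protocol, connectionTimeout and redirectPort values are Tomcat's stock defaults and are assumptions about this installation. Only URIEncoding="UTF-8" is the relevant change.

```xml
<!-- Sketch only: HTTP connector in conf/server.xml.
     port, protocol, connectionTimeout and redirectPort are assumed defaults;
     URIEncoding="UTF-8" makes Tomcat decode percent-encoded query strings
     as UTF-8. -->
<Connector port="8080"
           protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8"/>
```

Tomcat has to be restarted for a change in server.xml to take effect.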
Re: Indexing and Searching Chinese with SolrNet
Make sure your browser is set to UTF-8 encoding.

----- Original Message -----
From: Otis Gospodnetic otis_gospodne...@yahoo.com
To: solr-user@lucene.apache.org; bing...@asu.edu
Sent: Tue, January 18, 2011 10:39:16 AM
Subject: Re: Indexing and Searching Chinese with SolrNet

> Go to your Solr Admin page and use the Analysis functionality there to enter some Chinese text and see how it's getting analyzed at index and at search time.