Hi Mohan,

I ran your Case #1 through Solr 4.9.0’s Admin UI Analysis pane, and I can see 
that the analyzer for the “text_ar” field type does not remove all 
diacritics:

Indexed original: المؤسسة التجارية العمانية
Indexed analyzed: مؤسس تجار عمان

Query original: الموسسة التجارية
Query analyzed: موسس تجار

The analyzed query terms are the same as the first two analyzed indexed terms, 
with one exception: the hamza on the waw in the analyzed indexed term “مؤسس” 
was not stripped off by the analyzer, and so won’t match the analyzed query 
term “موسس”, which was entered by the user without the hamza.
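
As a rough illustration of the mismatch, the sketch below uses Python's standard unicodedata module to decompose the waw-with-hamza (U+0624) into a plain waw (U+0648) plus a combining hamza (U+0654), and then drops the combining mark. This only approximates one part of what a folding filter does; it is not ICU's actual implementation, and the helper name strip_combining_marks is made up for this example.

```python
import unicodedata

def strip_combining_marks(text):
    """Decompose to NFD, then drop combining marks such as HAMZA ABOVE (U+0654).

    Hypothetical helper: a simplified stand-in for diacritic folding,
    not the ICU folding algorithm itself.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Analyzed indexed term: meem, waw with hamza above (U+0624), seen, seen
indexed_term = "\u0645\u0624\u0633\u0633"
# Analyzed query term: meem, plain waw (U+0648), seen, seen
query_term = "\u0645\u0648\u0633\u0633"

# The raw terms differ only in the hamza, so they do not match.
print(indexed_term == query_term)
# After stripping the combining hamza, the indexed term equals the query term.
print(strip_combining_marks(indexed_term) == query_term)
```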

Adding ICUFoldingFilterFactory to the “text_ar” field type fixed Case #1 for me 
by stripping the hamza from the waw.  You can read more about this filter in 
the Solr Reference Guide (yes, this is basically for Solr 6.4, but I don’t 
think this functionality has changed between 4.9 and 6.4): 
<https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ICUFoldingFilter>.
If you do this, you can remove the LowerCaseFilterFactory, since 
ICUFoldingFilterFactory performs lowercasing as part of its work.
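
A sketch of what the modified field type might look like in schema.xml. This assumes your “text_ar” definition matches the one in the stock Solr example schema; the other filters and the choice to put the ICU filter in the slot where LowerCaseFilterFactory used to be are assumptions, so check the result against your own schema and in the Analysis pane:

```xml
<fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- assumed placement: replaces solr.LowerCaseFilterFactory -->
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Because this changes how terms are analyzed at index time, existing documents would need to be reindexed before the folded terms can match.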

Note that to use ICUFoldingFilterFactory you must add three jars to the lib/ 
directory in your solr home dir.  Here’s how I did it:

$ mkdir example/solr/lib
$ cp dist/solr-analysis-extras-4.9.0.jar example/solr/lib/
$ cp contrib/analysis-extras/lucene-libs/lucene-analyzers-icu-4.9.0.jar example/solr/lib/
$ cp contrib/analysis-extras/lib/icu4j-53.1.jar example/solr/lib/

--
Steve
www.lucidworks.com 

> On Feb 1, 2017, at 6:50 AM, mohanmca01 <mohanmc...@gmail.com> wrote:
> 
> Dear Steve,
> 
> Thanks for investigating our problem. Our project is basically a business
> directory search platform, and we have more than 100K business details.
> I’m providing you some examples of Arabic words to reproduce the problem.
> Please find attached a Word file where I explained everything along with
> screenshots: arabicSearch.docx
> <http://lucene.472066.n3.nabble.com/file/n4318227/arabicSearch.docx> 
> 
> Regarding upgrading to the latest version: our project is running on Java
> 1.7, and if I need to upgrade then we have to upgrade Java, the JBoss
> application server, etc., and this is not the right time for that activity
> at all.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Arabic-words-search-in-solr-tp4317733p4318227.html
> Sent from the Solr - User mailing list archive at Nabble.com.
