[ 
http://jira.nuxeo.org/browse/NXP-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=37711#action_37711
 ] 

Narcis Paslaru  commented on NXP-2434:
--------------------------------------

I resolved it by using a filter that strips all accents:
ISOLatin1AccentFilter.

To do this, first create a filter provider that returns an instance of this
filter:

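package org.nuxeo.search.compass;

import org.apache.lucene.analysis.ISOLatin1AccentFilter;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.compass.core.lucene.engine.analyzer.LuceneAnalyzerTokenFilterProvider;

public class NoAccentsFilterProvider implements
                LuceneAnalyzerTokenFilterProvider {

        /* (non-Javadoc)
         * @see org.compass.core.lucene.engine.analyzer.LuceneAnalyzerTokenFilterProvider#createTokenFilter(org.apache.lucene.analysis.TokenStream)
         */
        // Wraps the analyzer's token stream so that accents are stripped from each token.
        public TokenFilter createTokenFilter(TokenStream input) {
                return new ISOLatin1AccentFilter(input);
        }
}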

Then register it in the compass.cfg.xml file:

      <analyzer name="french" type="CustomAnalyzer"
        analyzerClass="org.apache.lucene.analysis.fr.FrenchAnalyzer"
        filters="noAccentsFilter"/>
      <analyzerFilter name="noAccentsFilter"
        type="org.nuxeo.search.compass.NoAccentsFilterProvider"/>

Finally, reindex the repository so that documents already indexed go through
the new filter.
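
To illustrate what accent folding does to a term, here is a small standalone
sketch. It is not ISOLatin1AccentFilter itself: it uses java.text.Normalizer
from the JDK as a stand-in, so its character coverage may differ from the
Lucene filter. The class and method names (AccentFoldingDemo, fold) are
made up for this example.

```java
import java.text.Normalizer;

final class AccentFoldingDemo {

    // Stand-in for accent folding (assumption: not the Lucene implementation).
    // Decomposes each character (NFD), then drops the combining marks.
    static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "\u00e9" is e with an acute accent, written as an escape to avoid
        // source-encoding problems.
        System.out.println(fold("\u00e9quipement")); // equipement
        System.out.println(fold("Itin\u00e9ris"));   // Itineris
    }
}
```

With folding applied to the indexed terms, the accented and unaccented
spellings end up as the same term. Note that case is left untouched by this
step.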

This should fix Example 2 from the description, but not the uppercase
searches.

To fix that, one must override the default search Seam action and set the
property of the search document to its value.toLowerCase().
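
A minimal sketch of that lowercasing step, assuming the overriding action
passes the user's input through a helper before setting it on the search
document. SearchValueNormalizer and normalize are hypothetical names, not
part of Nuxeo or Seam; the hook into the actual action is not shown.

```java
import java.util.Locale;

final class SearchValueNormalizer {

    // Hypothetical helper: lowercases the search value before it is set on
    // the search document. Null input is passed through unchanged.
    static String normalize(String value) {
        return value == null ? null : value.toLowerCase(Locale.FRENCH);
    }

    public static void main(String[] args) {
        System.out.println(normalize("ITINERIS"));   // itineris
        System.out.println(normalize("EQUIPEMENT")); // equipement
    }
}
```

An explicit Locale is passed so the result does not depend on the server's
default locale.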


> French search incoherence
> -------------------------
>
>                 Key: NXP-2434
>                 URL: http://jira.nuxeo.org/browse/NXP-2434
>             Project: Nuxeo Enterprise Platform
>          Issue Type: Bug
>          Components: Query / Search
>            Reporter: Narcis Paslaru 
>            Assignee: Thierry Delprat
>         Attachments: SearchEngineBackendTestCase.java, 
> SharedTestDataBuilder.java
>
>
> Here are two examples to illustrate the misbehaviour of the search engine when
> using the French analyzer:
> Example 1 :
>  In the database, there are two documents called :
>      - Entrant Itinéris
>      - Sortante Itineris
>   
>  When I search for "itinéris" : the two of them are returned. => OK!
>  When I search for "itineris" : the two of them are returned. => OK!
>  When I search for "ITINERIS" : none of them are returned.  => not OK!
>   
>   
>  Example 2 :
>  In the database, there are two documents called :
>      - CA équipement
>      - CA équipement social
>   
>  When I search for "équipement" : the two of them are returned. => OK!
>  When I search for "equipement" : none of them are returned. => not OK!  :-(
>  When I search for "EQUIPEMENT" : none of them are returned.  => not OK!  :-(

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://jira.nuxeo.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       