[
http://jira.nuxeo.org/browse/NXP-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=37711#action_37711
]
Narcis Paslaru commented on NXP-2434:
--------------------------------------
I've resolved it by using a filter that strips all the accents:
ISOLatin1AccentFilter.
To do this, one must first create a filter provider that returns
an instance of this filter:
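    package org.nuxeo.search.compass;

    import org.apache.lucene.analysis.ISOLatin1AccentFilter;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.compass.core.lucene.engine.analyzer.LuceneAnalyzerTokenFilterProvider;

    public class NoAccentsFilterProvider implements LuceneAnalyzerTokenFilterProvider {

        /*
         * @see LuceneAnalyzerTokenFilterProvider#createTokenFilter(TokenStream)
         */
        public TokenFilter createTokenFilter(TokenStream input) {
            // Wrap the analyzer's token stream so accented characters lose their accents.
            return new ISOLatin1AccentFilter(input);
        }
    }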
And register it in the compass.cfg.xml file:

    <analyzer name="french" type="CustomAnalyzer"
              analyzerClass="org.apache.lucene.analysis.fr.FrenchAnalyzer"
              filters="noAccentsFilter"/>
    <analyzerFilter name="noAccentsFilter"
                    type="org.nuxeo.search.compass.NoAccentsFilterProvider"/>
Then reindex the repository.
This should resolve Example 2 from the description, but not the
uppercase searches.
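
To illustrate what stripping accents does to the terms from the description, here is a small standalone sketch. It does not use Lucene or Compass: it uses java.text.Normalizer from the JDK as a stand-in for ISOLatin1AccentFilter, so it only approximates that filter (for instance, it does not expand ligatures). The class name AccentFoldingSketch and the method fold are hypothetical names chosen for the example.

```java
import java.text.Normalizer;

// Hypothetical stand-in for the accent-stripping step; NOT the Lucene filter itself.
public class AccentFoldingSketch {

    // Decompose each accented character into base letter + combining mark,
    // then drop the combining marks.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        // \u00e9 is "e" with an acute accent; escapes avoid source-encoding issues.
        System.out.println(fold("CA \u00e9quipement social")); // CA equipement social
        System.out.println(fold("Entrant Itin\u00e9ris"));     // Entrant Itineris
    }
}
```

Once the indexed terms and the query terms both go through the same folding, "equipement" and "équipement" end up as the same term. Case is left untouched here, so the uppercase searches still need separate handling.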
To resolve the uppercase case, one must override the default search Seam action and
set the property of the search document to its value.toLowerCase().
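
As a rough sketch of the lowercasing step only: the helper below is what an overridden search action could call before setting the value on the search document. It deliberately does not reference any Nuxeo or Seam class; SearchTermNormalizer and normalize are hypothetical names, and passing an explicit Locale is my assumption rather than part of the fix described above.

```java
import java.util.Locale;

// Hypothetical helper; the real override would live in the custom search Seam action.
public class SearchTermNormalizer {

    // Lowercase the user's input so it matches the lowercased indexed terms.
    static String normalize(String value) {
        if (value == null) {
            return null; // nothing typed in the search field
        }
        // Explicit locale (an assumption) keeps the result independent of the JVM default.
        return value.toLowerCase(Locale.FRENCH);
    }

    public static void main(String[] args) {
        System.out.println(normalize("ITINERIS"));   // itineris
        System.out.println(normalize("EQUIPEMENT")); // equipement
    }
}
```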
> French search incoherence
> -------------------------
>
> Key: NXP-2434
> URL: http://jira.nuxeo.org/browse/NXP-2434
> Project: Nuxeo Enterprise Platform
> Issue Type: Bug
> Components: Query / Search
> Reporter: Narcis Paslaru
> Assignee: Thierry Delprat
> Attachments: SearchEngineBackendTestCase.java,
> SharedTestDataBuilder.java
>
>
> Here are two examples to illustrate the misbehaviour of the search engine when
> using the French analyzer:
> Example 1 :
> In the database, there are two documents called :
> - Entrant Itinéris
> - Sortante Itineris
>
> When I search for "itinéris" : the two of them are returned. => OK!
> When I search for "itineris" : the two of them are returned. => OK!
> When I search for "ITINERIS" : none of them are returned. => not OK!
>
>
> Example 2 :
> In the database, there are two documents called :
> - CA équipement
> - CA équipement social
>
> When I search for "équipement" : the two of them are returned. => OK!
> When I search for "equipement" : none of them are returned. => not OK! :-(
> When I search for "EQUIPEMENT" : none of them are returned. => not OK! :-(