[ https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731125#action_12731125 ]
Robert Muir commented on SOLR-1279:
-----------------------------------

Sergey, have you looked at SOLR-1266?

By using the new stemEnglishPossessive=0 option, I think you can get the same behavior with WordDelimiterFilter, if you use preserveOriginal=1 along with catenateWords=1.

> ApostropheTokenizer
> -------------------
>
>                 Key: SOLR-1279
>                 URL: https://issues.apache.org/jira/browse/SOLR-1279
>             Project: Solr
>          Issue Type: New Feature
>          Components: Analysis
>            Reporter: Sergey Borisov
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: ApostropheTokenizer.zip
>
>
> ApostropheTokenizer creates extra tokens during the analysis stage for
> fields containing apostrophes. The reason for adding this is to ensure that
> documents that differ only by an apostrophe have the same relevancy score.
> For example, if the document contains the string "McDonald's", it will be
> tokenized as "McDonald's McDonalds". This way, a search for "McDonald's"
> or "McDonalds" will produce a similar score.
> This code handles up to two apostrophes in a token.
> To use this tokenizer, add the following lines to schema.xml:
> <analyzer type="index">
>   <filter class="org.apache.lucene.analysis.ApostropheTokenFactory"/>
>   ...
> </analyzer>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
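
For reference, the WordDelimiterFilter settings suggested in the comment above might look roughly like this in schema.xml. This is only a sketch: the field type name "text_apos", the whitespace tokenizer, and the generateWordParts=0 and splitOnCaseChange=0 settings are assumptions added for illustration and are not part of the suggestion. The stemEnglishPossessive attribute also requires a build that includes SOLR-1266.

```xml
<!-- Illustrative sketch; "text_apos" is a hypothetical field type name. -->
<fieldType name="text_apos" class="solr.TextField">
  <analyzer type="index">
    <!-- Assumption: split on whitespace so "McDonald's" reaches the filter intact. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- From the comment: stemEnglishPossessive=0 keeps the trailing "s",
         preserveOriginal=1 keeps the token with its apostrophe,
         catenateWords=1 joins the parts around the apostrophe.
         Assumptions: generateWordParts=0 and splitOnCaseChange=0 are added
         to avoid extra sub-tokens such as "Mc" or "Donald". -->
    <filter class="solr.WordDelimiterFilterFactory"
            stemEnglishPossessive="0"
            preserveOriginal="1"
            catenateWords="1"
            generateWordParts="0"
            splitOnCaseChange="0"/>
  </analyzer>
</fieldType>
```

With settings along these lines, "McDonald's" would be expected to be indexed both in its original form and as "McDonalds". The exact token stream is worth checking on the Solr analysis admin page before relying on it.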