[ 
https://issues.apache.org/jira/browse/SOLR-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett closed SOLR-1279.
-----------------------------------
       Resolution: Won't Fix
    Fix Version/s:     (was: 6.0)
                       (was: 4.9)

Closing this because it's been several years with no forward progress, and 
commenters pointed out another way to approach the issue with the 
WordDelimiterFilter. There was also an ApostropheFilterFactory added around 
v4.8, as part of the Turkish language support.
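
For comparison with the original proposal: as documented for Lucene's ApostropheFilter, that filter strips the apostrophe and everything after it, rather than emitting an extra apostrophe-free token. Below is a rough Python sketch of that behaviour only, not the Lucene code; the function name is made up for illustration.

```python
# Sketch only: mimics the documented behaviour of Lucene's ApostropheFilter
# (drop the apostrophe and everything that follows it). The name
# strip_from_apostrophe is hypothetical, not a Lucene/Solr API.
APOSTROPHES = ("'", "\u2019")  # ASCII apostrophe and right single quotation mark


def strip_from_apostrophe(token):
    """Return the part of the token before its first apostrophe, if any."""
    for i, ch in enumerate(token):
        if ch in APOSTROPHES:
            return token[:i]
    return token


if __name__ == "__main__":
    for word in ("McDonald's", "Türkiye'de", "fries"):
        print(word, "->", strip_from_apostrophe(word))
```

So "McDonald's" would come out as "McDonald" rather than "McDonalds", which is a different result from what this issue asked for.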

If improvements should be made in this area, let's file new issues for those.

> ApostropheTokenizer
> -------------------
>
>                 Key: SOLR-1279
>                 URL: https://issues.apache.org/jira/browse/SOLR-1279
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Sergey Borisov
>            Priority: Minor
>         Attachments: ApostropheTokenizer.zip
>
>
> ApostropheTokenizer creates extra tokens during the analysis stage for 
> fields containing apostrophes. The reason for adding this is to ensure that 
> documents that differ only by an apostrophe have the same relevancy score. 
> For example, if the document contains the string "McDonald's", it will be 
> tokenized as "McDonald's McDonalds". This way, a search for "McDonald's" 
> or "McDonalds" will produce a similar score.
> This code handles up to two apostrophes in a token.
> To use this tokenizer, add the following lines to schema.xml:
> <analyzer type="index">
>       <filter class="org.apache.lucene.analysis.ApostropheTokenFactory"/>
> ...
> </analyzer>
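
The tokenization described in the quoted issue can be sketched roughly as follows. This is a Python illustration of the idea only, not the code from ApostropheTokenizer.zip. The function names are hypothetical, whitespace splitting stands in for the real tokenizer, and leaving tokens with more than two apostrophes unchanged is an assumption, since the description only says the code handles up to two.

```python
# Sketch only: emit the original token plus an apostrophe-free copy, as in the
# "McDonald's McDonalds" example above. All names here are hypothetical.
def apostrophe_variants(token, max_apostrophes=2):
    """Return [token] or [token, token without apostrophes]."""
    count = token.count("'")
    # Assumption: tokens with more than max_apostrophes are passed through as-is.
    if count == 0 or count > max_apostrophes:
        return [token]
    return [token, token.replace("'", "")]


def analyze(text):
    """Split on whitespace and expand each token into its variants."""
    tokens = []
    for token in text.split():
        tokens.extend(apostrophe_variants(token))
    return tokens


if __name__ == "__main__":
    print(analyze("McDonald's fries"))
```

Because both forms are indexed, a query for either "McDonald's" or "McDonalds" matches the same documents.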



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
