[ http://issues.apache.org/jira/browse/SOLR-41?page=all ]

Yonik Seeley resolved SOLR-41.
------------------------------

    Resolution: Fixed
      Assignee: Yonik Seeley

Thanks Boris, I just committed this.

> PATCH: HyphenatedWordsFilter, Factory and test
> ----------------------------------------------
>
>                 Key: SOLR-41
>                 URL: http://issues.apache.org/jira/browse/SOLR-41
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Boris Vitez
>         Assigned To: Yonik Seeley
>            Priority: Minor
>         Attachments: HyphenatedWordsFilter.java, hyphenatedwordsfilter.patch, 
> hyphenatedwordsfilter.patch, HyphenatedWordsFilterFactory.java, 
> TestHyphenatedWordsFilter.java
>
>
> When plain text is extracted from documents, many words are often 
> hyphenated and split across two lines. This is common in documents 
> that use narrow text columns, such as newsletters.
> To make searching more effective, this filter rejoins hyphenated 
> words that were split across two lines.
> This filter has to be used together with the WordDelimiterFilter with 
> catenateWords=1.
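
As a rough illustration of the idea described in the issue, here is a minimal, self-contained sketch in plain Java. It is not the committed HyphenatedWordsFilter and does not use the Lucene TokenStream API. The class name HyphenJoinSketch and the method joinHyphenated are hypothetical. The sketch assumes that a token ending in a hyphen is glued to the token that follows it, and that the hyphen itself is left in place for a later step (such as WordDelimiterFilter with catenateWords=1) to deal with. That is only one reading of the description above, and the actual patch may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the code attached to SOLR-41.
public class HyphenJoinSketch {

    // Joins each token that ends with '-' to the token(s) following it.
    // Assumption: the hyphen is kept, so "ecologi-" + "cal" gives "ecologi-cal".
    static List<String> joinHyphenated(List<String> tokens) {
        List<String> out = new ArrayList<>();
        StringBuilder pending = null; // holds a word still waiting for its second half
        for (String tok : tokens) {
            if (pending != null) {
                pending.append(tok);
                if (tok.endsWith("-")) {
                    continue; // still broken; keep accumulating
                }
                out.add(pending.toString());
                pending = null;
            } else if (tok.length() > 1 && tok.endsWith("-")) {
                pending = new StringBuilder(tok); // first half of a broken word
            } else {
                out.add(tok); // ordinary token (a lone "-" is passed through)
            }
        }
        if (pending != null) {
            out.add(pending.toString()); // trailing fragment with no continuation
        }
        return out;
    }

    public static void main(String[] args) {
        // Whitespace tokens as they might appear after a line break in a narrow column.
        List<String> tokens = Arrays.asList("an", "ecologi-", "cal", "newsletter");
        System.out.println(joinHyphenated(tokens));
    }
}
```

In this sketch a fragment at the very end of the input is emitted unchanged, since there is nothing left to join it to.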


        
