PATCH: HyphenatedWordsFilter, Factory and test
----------------------------------------------

                 Key: SOLR-41
                 URL: http://issues.apache.org/jira/browse/SOLR-41
             Project: Solr
          Issue Type: New Feature
          Components: search
            Reporter: Boris Vitez
            Priority: Minor


When plain text is extracted from documents, many words are often hyphenated 
and split across two lines. This is common in documents that use narrow text 
columns, such as newsletters.
To improve search quality, this filter rejoins hyphenated words that were 
split across two lines.
This filter must be used together with the WordDelimiterFilter configured with 
catenateWords=1.
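
The joining step described above can be sketched roughly as follows. This is 
only an illustration in plain Java, not the code from the attached patch: the 
class name HyphenJoinSketch and the method joinHyphenated are hypothetical, and 
it works on a simple list of strings rather than a Lucene TokenStream. It also 
assumes the hyphen is kept in the joined token, so that a later step such as 
WordDelimiterFilter with catenateWords=1 could produce the unhyphenated word.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: merges a token ending in '-' with the token that follows it.
public class HyphenJoinSketch {

    // Assumption: tokens arrive in document order, already split on whitespace
    // and line breaks, so "ecologi-" and "cal" are adjacent entries.
    static List<String> joinHyphenated(List<String> tokens) {
        List<String> out = new ArrayList<>();
        StringBuilder pending = null; // holds a partial word ending in '-'
        for (String token : tokens) {
            if (pending == null) {
                pending = new StringBuilder(token);
            } else {
                pending.append(token); // the hyphen is kept (assumption)
            }
            if (!token.endsWith("-")) {
                // The word is complete; emit it and start over.
                out.add(pending.toString());
                pending = null;
            }
        }
        if (pending != null) {
            // Input ended on a hyphenated fragment; emit it unchanged.
            out.add(pending.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("ecologi-", "cal", "develop-", "ment", "news");
        System.out.println(joinHyphenated(tokens));
        // prints [ecologi-cal, develop-ment, news]
    }
}
```

Note that this sketch cannot tell a line-break hyphen from a genuine trailing 
hyphen, so it would also join a token that legitimately ends in '-' with the 
next one.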


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
