[ 
https://issues.apache.org/jira/browse/LUCENE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598791#comment-13598791
 ] 

Uwe Schindler commented on LUCENE-4822:
---------------------------------------

The good thing about isKeyword being parameter-free is that it also makes it 
possible to mark "keywords" based on other attributes (e.g. when Kuromoji sets 
a specific base form, when a token has no position increment, or when a token 
falls in a certain Unicode range, such as Arabic characters, that should never 
be passed to a stemmer; the examples may be nonsense, but they show the 
possibilities).
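
To make the idea concrete, here is a rough, self-contained sketch of a 
parameter-free isKeyword() that decides from token state other than a fixed 
term set. Everything in it is an assumption for illustration: SketchToken, 
SketchKeywordMarker and AttributeBasedMarker are hypothetical stand-ins, not 
Lucene classes, and the two rules (position increment of 0, Arabic script) 
are just the examples from above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token's attributes; NOT a Lucene class.
final class SketchToken {
    final String term;
    final int positionIncrement;
    boolean keyword; // plays the role of a keyword flag on the token

    SketchToken(String term, int positionIncrement) {
        this.term = term;
        this.positionIncrement = positionIncrement;
    }
}

// Hypothetical stand-in for a keyword-marking filter whose isKeyword()
// takes no parameters and reads the current token state instead.
abstract class SketchKeywordMarker {
    protected SketchToken current; // state that subclasses may inspect

    protected abstract boolean isKeyword();

    final void mark(List<SketchToken> tokens) {
        for (SketchToken t : tokens) {
            current = t;
            if (isKeyword()) {
                t.keyword = true;
            }
        }
    }
}

// Marks tokens that have no position increment or contain Arabic characters.
final class AttributeBasedMarker extends SketchKeywordMarker {
    @Override
    protected boolean isKeyword() {
        if (current.positionIncrement == 0) {
            return true;
        }
        return current.term.codePoints().anyMatch(
            cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.ARABIC);
    }
}

final class AttributeBasedMarkerDemo {
    static List<SketchToken> run() {
        List<SketchToken> tokens = new ArrayList<>();
        tokens.add(new SketchToken("running", 1));
        tokens.add(new SketchToken("jogging", 0)); // stacked on the previous position
        tokens.add(new SketchToken("\u0643\u062A\u0627\u0628", 1)); // Arabic word
        new AttributeBasedMarker().mark(tokens);
        return tokens;
    }

    public static void main(String[] args) {
        for (SketchToken t : run()) {
            System.out.println(t.term + " keyword=" + t.keyword);
        }
    }
}
```

Because isKeyword() takes no arguments, the subclass is free to look at any 
state it likes; a term-set or pattern check is then just one more subclass.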
                
> Add PatternKeywordTokenFilter to mark keywords based on regular expressions
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-4822
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4822
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 4.2
>            Reporter: Simon Willnauer
>            Priority: Minor
>             Fix For: 5.0, 4.3
>
>         Attachments: LUCENE-4822.patch, LUCENE-4822.patch
>
>
> Today we need to pass in an explicit set of terms that we want to mark as 
> keywords. It might make sense to allow patterns as well, to prevent certain 
> suffixes etc. from being keyworded.
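
As a rough illustration of the pattern-based marking proposed in the issue 
description, the sketch below checks each term against a regular expression 
using only java.util.regex. PatternKeywordSketch and its methods are 
hypothetical names chosen for this example; this is not the attached patch 
and not Lucene code, and the ".*ing" pattern is an arbitrary assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of regex-based keyword detection; NOT Lucene code.
final class PatternKeywordSketch {
    // One Matcher is created up front and reset per term to avoid reallocation.
    private final Matcher matcher;

    PatternKeywordSketch(Pattern pattern) {
        this.matcher = pattern.matcher("");
    }

    // True if the whole term matches the pattern.
    boolean matchesKeywordPattern(CharSequence term) {
        matcher.reset(term);
        return matcher.matches();
    }

    // Maps each term to whether it would be treated as a keyword.
    static Map<String, Boolean> classify(Pattern pattern, List<String> terms) {
        PatternKeywordSketch sketch = new PatternKeywordSketch(pattern);
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String term : terms) {
            result.put(term, sketch.matchesKeywordPattern(term));
        }
        return result;
    }

    public static void main(String[] args) {
        Pattern endsWithIng = Pattern.compile(".*ing");
        System.out.println(classify(endsWithIng, Arrays.asList("running", "runs", "king")));
    }
}
```

Matcher.matches() requires the entire term to match, so a suffix rule needs 
a leading ".*"; find() would be the alternative if partial matches should 
count.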
