[ https://issues.apache.org/jira/browse/LUCENE-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917381#comment-13917381 ]

Lukas Vlcek commented on LUCENE-5484:
-------------------------------------

Excellent, I will give it a try.
Thanks!

> Distinct control of recursion levels for prefix and suffix in Hunspell.
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-5484
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5484
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Lukas Vlcek
>            Priority: Minor
>
> Currently, the recursionCap option controls the depth of recursion in the
> Hunspell token filter. The recursion applies an allowed affix rule to the
> input token and then passes the output token(s) back in as input,
> recursively.
> However, recursionCap does not distinguish between prefix and suffix rules;
> it only counts the total number of rules applied. For example, if
> recursionCap is set to 1, it allows all of the following combinations:
> - 2 prefix rules, 0 suffix rules
> - 1 prefix rule, 1 suffix rule
> - 0 prefix rules, 2 suffix rules
> In some cases it is necessary to distinguish between prefix and suffix
> rules and to have finer control over how many times each is applied. The
> requested feature should allow the recursion level to be set separately for
> prefix and suffix rules.
> A specific example is the Czech dictionary, which gives the best results
> when suffix rules are applied only once, hence recursionCap = 0. But if a
> prefix rule is applied to the input token, a suffix rule can no longer be
> applied, and the filter produces a token that is not in root form. Setting
> recursionCap = 1 produces so many irrelevant tokens that the Hunspell token
> filter becomes useless. A good solution would be to tell the Hunspell token
> filter to apply at most 1 prefix rule and at most 1 suffix rule (that is,
> never 0 prefix rules and 2 suffix rules).
> Generally, this probably depends a lot on how the particular dictionary and
> affix rules are constructed, so it might be considered an expert feature
> rather than a general-purpose one.
> (There was some relevant discussion in LUCENE-5468.)
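
To make the difference between a single total cap and separate prefix/suffix
caps concrete, here is a minimal sketch in Python. It does not use Lucene's
API or a real Hunspell dictionary: the affix lists, the toy dictionary, and
the names stems, stems_single_cap, max_prefix and max_suffix are all
hypothetical, and affixes are simply stripped without any flag or condition
checks.

```python
# Toy affix rules and dictionary (hypothetical, for illustration only).
PREFIXES = ("un",)
SUFFIXES = ("s", "er")
DICTIONARY = {"pack", "packer"}


def stems(word, max_prefix, max_suffix, max_total=None):
    """Return dictionary words reachable by stripping affixes within the caps."""
    if max_total is None:
        max_total = max_prefix + max_suffix
    found = set()

    def visit(token, prefixes_used, suffixes_used):
        if token in DICTIONARY:
            found.add(token)
        # Stop once the total number of applied rules is exhausted.
        if prefixes_used + suffixes_used >= max_total:
            return
        if prefixes_used < max_prefix:
            for p in PREFIXES:
                if token.startswith(p) and len(token) > len(p):
                    visit(token[len(p):], prefixes_used + 1, suffixes_used)
        if suffixes_used < max_suffix:
            for s in SUFFIXES:
                if token.endswith(s) and len(token) > len(s):
                    visit(token[:-len(s)], prefixes_used, suffixes_used + 1)

    visit(word, 0, 0)
    return found


def stems_single_cap(word, recursion_cap):
    """Model a single cap: recursion_cap + 1 rules in total, of any kind."""
    total = recursion_cap + 1
    return stems(word, total, total, max_total=total)


if __name__ == "__main__":
    # One rule in total: the prefixed and suffixed word never reaches its root.
    print(stems_single_cap("unpacks", 0))
    # Two rules in total: the root is found, but two suffixes may also be stripped.
    print(stems_single_cap("unpacks", 1))
    print(stems_single_cap("packers", 1))
    # At most one prefix and one suffix: root found, no double suffix stripping.
    print(stems("unpacks", 1, 1))
    print(stems("packers", 1, 1))
```

In this toy setup, a single cap of 1 lets "packers" reduce to both "packer"
and "pack", while separate caps of one prefix and one suffix still resolve
"unpacks" to "pack" but keep "packers" at "packer" only.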



