[
https://issues.apache.org/jira/browse/LUCENE-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917381#comment-13917381
]
Lukas Vlcek commented on LUCENE-5484:
-------------------------------------
Excellent, I will give it a try.
Thanks!
> Distinct control of recursion levels for prefix and suffix in Hunspell.
> -----------------------------------------------------------------------
>
> Key: LUCENE-5484
> URL: https://issues.apache.org/jira/browse/LUCENE-5484
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Reporter: Lukas Vlcek
> Priority: Minor
>
> Currently, the recursionCap option controls the depth of recursion in the
> Hunspell token filter. Recursion means that an allowed affix rule is applied
> to the input token and each output token is then fed back in as input,
> recursively.
> However, recursionCap does not distinguish between how many prefix rules and
> how many suffix rules were applied; it only counts the total. For example,
> if recursionCap is set to 1, all of the following combinations are allowed:
> - 2 prefix rules, 0 suffix rules
> - 1 prefix rule, 1 suffix rule
> - 0 prefix rules, 2 suffix rules
> In some cases it is necessary to distinguish between prefix rules and suffix
> rules and to have finer control over how many times each is applied.
> The requested feature should allow setting the recursion level separately
> for prefix and suffix rules.
> A specific example is the Czech dictionary, which gives the best results when
> suffix rules are applied only once, hence recursionCap = 0. But with that
> setting, once a prefix rule has been applied to an input token, no suffix
> rule can be applied, and the filter produces a token that is not in root
> form. Setting recursionCap = 1 instead produces so many irrelevant tokens
> that the Hunspell token filter becomes useless. A good solution would be to
> tell the Hunspell token filter to apply up to 1 prefix rule and up to 1
> suffix rule only (meaning the combination of 0 prefix rules and 2 suffix
> rules is never allowed).
> Generally, this probably depends a lot on how the particular dictionary and
> affix rules are constructed, so it might be considered an expert feature
> rather than a general-purpose one.
> (There was some relevant discussion going on in LUCENE-5468)
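To make the difference concrete, here is a minimal sketch that lists which (prefix, suffix) counts a single total cap admits compared with two separate caps. It is not Lucene code: the class and method names are hypothetical, and it assumes, as the description above implies, that recursionCap = N permits up to N + 1 rule applications in total.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are part of the Lucene API.
public class AffixCapSketch {

    // Combinations allowed by a single total cap. Assumption taken from the
    // issue text: recursionCap = N permits up to N + 1 rule applications.
    static List<String> underTotalCap(int recursionCap) {
        List<String> allowed = new ArrayList<>();
        int maxTotal = recursionCap + 1;
        for (int prefixes = 0; prefixes <= maxTotal; prefixes++) {
            for (int suffixes = 0; prefixes + suffixes <= maxTotal; suffixes++) {
                allowed.add(prefixes + " prefix, " + suffixes + " suffix");
            }
        }
        return allowed;
    }

    // Combinations allowed when prefix and suffix rules are capped separately,
    // as the issue requests.
    static List<String> underSeparateCaps(int maxPrefixes, int maxSuffixes) {
        List<String> allowed = new ArrayList<>();
        for (int prefixes = 0; prefixes <= maxPrefixes; prefixes++) {
            for (int suffixes = 0; suffixes <= maxSuffixes; suffixes++) {
                allowed.add(prefixes + " prefix, " + suffixes + " suffix");
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        System.out.println("recursionCap = 1:        " + underTotalCap(1));
        System.out.println("max 1 prefix, 1 suffix:  " + underSeparateCaps(1, 1));
    }
}
```

With separate caps of 1 and 1, the "1 prefix, 1 suffix" combination stays available while "0 prefix, 2 suffix" and "2 prefix, 0 suffix" drop out, which is the behaviour the Czech dictionary example asks for.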