[
https://issues.apache.org/jira/browse/LUCENE-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715849#action_12715849
]
viobade commented on LUCENE-1491:
---------------------------------
I think it is better to keep the main goal of ngram: groups of characters between
min and max. If a practical situation calls for a minimum ngram of one or two
characters, this can be done by setting the minimum; otherwise the filter must
work the way it is expected to. If I expect subwords with a minimum length of 3,
why do I get a token with two characters when it does not meet that condition?
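
To make the expected behavior concrete, here is a rough sketch in plain Java. It
is not the Lucene filter and not the attached patch; the class EdgeNGramSketch
and the method edgeNGrams are hypothetical names used only for illustration. It
assumes front-edge grams, skips any token shorter than the minimum, and keeps
processing the rest of the stream instead of stopping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a hypothetical helper, not Lucene's EdgeNGramTokenFilter.
class EdgeNGramSketch {

    // Returns the front-edge n-grams of each token, with sizes from minGram
    // up to maxGram (capped at the token length).
    static List<String> edgeNGrams(List<String> tokens, int minGram, int maxGram) {
        if (minGram < 1 || maxGram < minGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() < minGram) {
                // Too short to meet the minimum: emit nothing for this token,
                // but continue with the tokens that follow it.
                continue;
            }
            int longest = Math.min(maxGram, token.length());
            for (int size = minGram; size <= longest; size++) {
                grams.add(token.substring(0, size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // "ab" is shorter than minGram = 3, so it yields no grams;
        // "abcde" still yields "abc" and "abcd".
        System.out.println(edgeNGrams(Arrays.asList("ab", "abcde"), 3, 4));
    }
}
```

With min 3 and max 4, the tokens "ab" and "abcde" give [abc, abcd]: the short
token produces nothing and the later token is still processed. Anyone who wants
the two-character token only has to set the minimum to 2.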
> EdgeNGramTokenFilter stops on tokens smaller then minimum gram size.
> --------------------------------------------------------------------
>
> Key: LUCENE-1491
> URL: https://issues.apache.org/jira/browse/LUCENE-1491
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Affects Versions: 2.4, 2.4.1, 2.9, 3.0
> Reporter: Todd Feak
> Assignee: Otis Gospodnetic
> Fix For: 2.9
>
> Attachments: LUCENE-1491.patch
>
>
> If a token is encountered in the stream that is shorter in length than the
> min gram size, the filter will stop processing the token stream.
> Working up a unit test now, but may be a few days before I can provide it.
> Wanted to get it in the system.