[ https://issues.apache.org/jira/browse/LUCENE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863378#action_12863378 ]
Steven Rowe commented on LUCENE-2400:
-------------------------------------

Thanks Uwe!

> ShingleFilter: don't output all-filler shingles/unigrams; also, convert from TermAttribute to CharTermAttribute
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2400
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2400
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 3.0.1
>            Reporter: Steven Rowe
>            Assignee: Uwe Schindler
>            Priority: Minor
>         Attachments: LUCENE-2400.patch, LUCENE-2400.patch, LUCENE-2400.patch, LUCENE-2400.patch
>
>
> When the input token stream to ShingleFilter has position increments greater
> than one, filler tokens are inserted for each position for which there is no
> token in the input token stream. As a result, unigrams (if configured) and
> shingles can be filler-only. Filler-only output tokens make no sense - these
> should be removed.
> Also, because TermAttribute has been deprecated in favor of
> CharTermAttribute, the patch will also convert TermAttribute usages to
> CharTermAttribute in ShingleFilter.
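
To illustrate the behavior the issue describes, here is a minimal, self-contained Java sketch. It does not use Lucene and is not the code from LUCENE-2400.patch. The class ShingleSketch, its Token type, and the "_" filler string are hypothetical names chosen for this example only. The sketch treats a position increment greater than one as a gap, puts a filler slot at each missing position, and skips any unigram or shingle that consists only of fillers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of shingling over a token stream with position gaps.
// Hypothetical illustration only; not Lucene's ShingleFilter implementation.
class ShingleSketch {
    // Assumed filler text for this example.
    static final String FILLER = "_";

    // A term plus its position increment relative to the previous token.
    static final class Token {
        final String term;
        final int posInc;

        Token(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    // Expand gaps into explicit slots; null marks a position with no input token.
    static List<String> expand(List<Token> input) {
        List<String> slots = new ArrayList<>();
        for (Token t : input) {
            for (int i = 1; i < t.posInc; i++) {
                slots.add(null);
            }
            slots.add(t.term);
        }
        return slots;
    }

    // Build unigrams (optional) and shingles up to maxSize, dropping filler-only ones.
    static List<String> shingles(List<Token> input, int maxSize, boolean outputUnigrams) {
        List<String> slots = expand(input);
        List<String> out = new ArrayList<>();
        int minSize = outputUnigrams ? 1 : 2;
        for (int start = 0; start < slots.size(); start++) {
            for (int size = minSize; size <= maxSize && start + size <= slots.size(); size++) {
                boolean hasRealToken = false;
                StringBuilder sb = new StringBuilder();
                for (int i = start; i < start + size; i++) {
                    String term = slots.get(i);
                    if (term != null) {
                        hasRealToken = true;
                    }
                    if (i > start) {
                        sb.append(' ');
                    }
                    sb.append(term == null ? FILLER : term);
                }
                // The point of the issue: never emit an all-filler unigram or shingle.
                if (hasRealToken) {
                    out.add(sb.toString());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "sentence" arrives with a position increment of 3, leaving two empty positions.
        List<Token> input = Arrays.asList(
                new Token("please", 1),
                new Token("divide", 1),
                new Token("sentence", 3));
        // Prints: [please, please divide, divide, divide _, _ sentence, sentence]
        System.out.println(shingles(input, 2, true));
    }
}
```

In this model, the two empty positions would otherwise yield the filler-only unigram "_" (twice) and the filler-only bigram "_ _". Shingles that mix fillers with real tokens, such as "divide _", are kept.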