[ https://issues.apache.org/jira/browse/LUCENE-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-4065:
--------------------------------

    Attachment: LUCENE-4065_test.patch

Test case (boiled down from TestRandomChains).

A much simpler one could be made.

> FilteringTokenFilter should never corrupt the tokenstream graph
> ---------------------------------------------------------------
>
>                 Key: LUCENE-4065
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4065
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Robert Muir
>         Attachments: LUCENE-4065_test.patch
>
>
> Currently, removers like StopFilter have an option (true/false) to enable
> position increments.
> If it's true: it both inserts gaps where necessary AND propagates gaps down
> the stream.
> If it's false: it does neither, which can totally mess up the tokenstream
> graph (e.g. move synonyms to another word).
> There are totally valid natural use cases for false, where you don't want gaps
> because you want phrase queries to act as if the word was never actually there.
> But 'not inserting gaps' is separate from proper propagation of existing gaps.
> So I think we should provide an option (either fix 'false' or make it an
> enum), where you still get a legit tokenstream and don't totally screw it up,
> but you simply omit gaps.
> See LUCENE-3848 for more information (where we at least fixed this case to
> not begin the tokenstream with posinc=0).
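
To make the three behaviours concrete, here is a minimal, self-contained sketch. It is not Lucene code and not the attached patch: `Token`, `Mode`, `filter` and `render` are hypothetical names invented for illustration, and the `OMIT_GAPS` branch is only one possible reading of "omit gaps but keep a legit tokenstream" (positions where every token was removed are collapsed, while tokens that shared a position stay stacked and upstream gaps are kept). `KEEP_GAPS` models the existing 'true' setting and `DROP_INCREMENTS` models the existing 'false' setting as described in the issue.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PosIncSketch {
    /** Hypothetical stand-in for a token: a term plus its position increment. */
    record Token(String term, int posInc) {}

    /** KEEP_GAPS ~ 'true', DROP_INCREMENTS ~ 'false', OMIT_GAPS = assumed third option. */
    enum Mode { KEEP_GAPS, DROP_INCREMENTS, OMIT_GAPS }

    static List<Token> filter(List<Token> in, Set<String> stop, Mode mode) {
        // Pass 1: absolute position of every token, and which positions keep a token.
        int[] pos = new int[in.size()];
        Set<Integer> surviving = new HashSet<>();
        int p = -1;
        for (int i = 0; i < in.size(); i++) {
            p += in.get(i).posInc();
            pos[i] = p;
            if (!stop.contains(in.get(i).term())) surviving.add(p);
        }
        // Pass 2: emit the kept tokens with increments chosen according to the mode.
        List<Token> out = new ArrayList<>();
        int prevOut = -1;                    // last output position
        int emptied = 0;                     // positions where every token was removed
        int lastEmptied = Integer.MIN_VALUE; // avoids counting one position twice
        for (int i = 0; i < in.size(); i++) {
            Token t = in.get(i);
            if (stop.contains(t.term())) {
                if (!surviving.contains(pos[i]) && pos[i] != lastEmptied) {
                    emptied++;
                    lastEmptied = pos[i];
                }
                continue;
            }
            int inc;
            if (mode == Mode.DROP_INCREMENTS) {
                // The removed token's increment is simply lost.
                inc = t.posInc();
            } else {
                // KEEP_GAPS keeps the original position; OMIT_GAPS closes emptied positions.
                int target = (mode == Mode.KEEP_GAPS) ? pos[i] : pos[i] - emptied;
                inc = target - prevOut;
                prevOut = target;
            }
            out.add(new Token(t.term(), inc));
        }
        return out;
    }

    /** Renders tokens as "term/posInc" separated by spaces. */
    static String render(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(t.term()).append('/').append(t.posInc());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "thee" is a synonym stacked on "the" (posInc=0); "the" and "a" are stop words.
        List<Token> in = List.of(new Token("fast", 1), new Token("the", 1),
                new Token("thee", 0), new Token("fox", 1),
                new Token("a", 1), new Token("dog", 1));
        Set<String> stop = Set.of("the", "a");
        for (Mode m : Mode.values()) {
            System.out.println(m + ": " + render(filter(in, stop, m)));
        }
    }
}
```

With this input, `KEEP_GAPS` yields `fast/1 thee/1 fox/1 dog/2`, `DROP_INCREMENTS` yields `fast/1 thee/0 fox/1 dog/1` (the synonym "thee" has moved onto "fast", which is the corruption described above), and `OMIT_GAPS` yields `fast/1 thee/1 fox/1 dog/1` (no gap before "dog", but "thee" keeps its own position). The two-pass approach is chosen only for clarity; a real TokenFilter is streaming and would have to make the same decision incrementally.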