[ https://issues.apache.org/jira/browse/LUCENE-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Ferenczi updated LUCENE-7708:
---------------------------------
    Attachment: LUCENE-7708.patch

Thanks Steve! I pushed a new patch that fixes the test failures.

> Track PositionLengthAttribute abuse
> -----------------------------------
>
>                 Key: LUCENE-7708
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7708
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser, modules/analysis
>            Reporter: Jim Ferenczi
>         Attachments: LUCENE-7708.patch, LUCENE-7708.patch
>
>
> Some token filters use the position length attribute of the token stream to
> encode the number of terms they put in a single token.
> This breaks query parsing because it creates a disconnected graph.
> I've tracked down the abusive cases to 2 candidates:
> * ShingleFilter, which sets the position length attribute to the length of the
> shingle.
> * CJKBigramFilter, which always sets the position length attribute to 2.
> I don't think these filters should set the position length at all, so the best
> fix would be to remove the attribute from these token filters, but this could
> break BWC.
> This is nonetheless a serious bug, since shingles and CJK bigrams now produce
> invalid queries.
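To illustrate why the issue description calls the resulting graph "disconnected", here is a minimal sketch in plain Java. It does not use Lucene. The class `PositionLengthSketch`, the `Tok` type, and the `everyTokenReachable` check are hypothetical stand-ins, and the example assumes two-word shingles emitted without unigrams, each with a position increment of 1. Each token is treated as an edge from its start position to its start position plus its position length.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PositionLengthSketch {

    /** Hypothetical stand-in for a token: term text plus the two position attributes. */
    static final class Tok {
        final String term;
        final int posInc; // position increment relative to the previous token
        final int posLen; // number of positions the token claims to span

        Tok(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    /**
     * Simplified connectivity check (an assumption of this sketch, not Lucene's
     * own logic): a token is an edge from its start position to start + posLen.
     * Returns true only if every token starts at a position reachable from
     * position 0. Assumes the first token has posInc == 1, so it starts at 0.
     */
    static boolean everyTokenReachable(List<Tok> stream) {
        Set<Integer> reachable = new HashSet<>();
        reachable.add(0);
        int pos = -1;
        for (Tok t : stream) {
            pos += t.posInc;
            if (!reachable.contains(pos)) {
                return false; // no earlier token ends where this one starts
            }
            reachable.add(pos + t.posLen);
        }
        return true;
    }

    public static void main(String[] args) {
        // Shingles of "a b c" with posLen set to the shingle size (the reported abuse).
        List<Tok> abused = Arrays.asList(new Tok("a b", 1, 2), new Tok("b c", 1, 2));
        // The same shingles with posLen left at 1.
        List<Tok> plain = Arrays.asList(new Tok("a b", 1, 1), new Tok("b c", 1, 1));

        System.out.println("posLen = shingle size: " + everyTokenReachable(abused));
        System.out.println("posLen = 1:            " + everyTokenReachable(plain));
    }
}
```

In the first stream, "a b" spans positions 0 to 2 while "b c" starts at position 1, which no token leads into, so the check returns false. With a position length of 1 the tokens chain 0 to 1 to 2 and the check returns true. A stream where every token has position length 2, as described for CJKBigramFilter, behaves like the first case.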