[ https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026129#comment-16026129 ]
Dawid Weiss commented on LUCENE-7848:
-------------------------------------
{code}
// Note: the flag constants are defined on WordDelimiterGraphFilter; protWords is not
// defined in this snippet and presumably comes from the surrounding test code.
Analyzer a4 = new Analyzer() {
  @Override
  public TokenStreamComponents createComponents(String field) {
    final int flags2 = GENERATE_WORD_PARTS | PRESERVE_ORIGINAL | GENERATE_NUMBER_PARTS
        | STEM_ENGLISH_POSSESSIVE;
    Tokenizer tokenizer = new MockTokenizer(MockTokenizer.WHITESPACE, false);
    StopFilter filter = new StopFilter(tokenizer, StandardAnalyzer.STOP_WORDS_SET);
    return new TokenStreamComponents(tokenizer,
        new WordDelimiterGraphFilter(filter, flags2, protWords));
  }
};

String in = "aaa,bbb foo - bar";
// Dump the resulting token graph in Graphviz dot format to stdout.
PrintWriter pw = new PrintWriter(System.out);
new TokenStreamToDot(in, a4.tokenStream("", in), pw).toDot();
pw.flush();
{code}
Here's the analyzer chain that I used for testing. I can't provide a full test
-- it was just an ad-hoc hacking session.
> QueryBuilder.analyzeGraphPhrase does not handle gaps correctly
> --------------------------------------------------------------
>
> Key: LUCENE-7848
> URL: https://issues.apache.org/jira/browse/LUCENE-7848
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.5, 6.6
> Reporter: Jim Ferenczi
>
> Position increments greater than 1 are ignored when the query builder creates
> a graph phrase query.
> Instead it should use SpanNearQuery.addGap for pos incr > 1.
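
To make the gap arithmetic in the description above concrete, here is a minimal, library-free sketch. It assumes that the gap in front of a token is simply its position increment minus one. GapSketch and gapWidths are hypothetical names used only for illustration, not Lucene APIs. In Lucene itself, a non-zero width would presumably be what is passed to SpanNearQuery.Builder.addGap before adding the clause for that token.

```java
import java.util.ArrayList;
import java.util.List;

public class GapSketch {

    /**
     * Hypothetical helper: for each token, returns how many empty
     * positions precede it, given the tokens' position increments.
     */
    public static List<Integer> gapWidths(int[] posIncs) {
        List<Integer> gaps = new ArrayList<>();
        for (int inc : posIncs) {
            // An increment of 1 means the token directly follows the previous one.
            // A larger increment leaves (inc - 1) unfilled positions in between.
            // An increment of 0 (a stacked token) is treated as having no gap.
            gaps.add(Math.max(0, inc - 1));
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Assumed increments: three tokens, with one position dropped before the last.
        int[] posIncs = {1, 1, 2};
        System.out.println(gapWidths(posIncs));
    }
}
```

A phrase builder that ignores these widths treats every token as adjacent to the previous one, which is the behaviour the issue reports.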