[ https://issues.apache.org/jira/browse/LUCENE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Ferenczi reassigned LUCENE-8137:
------------------------------------

    Assignee: Jim Ferenczi

> GraphTokenStreamFiniteStrings does not handle position inc > 1 in multi-word synonyms
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8137
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8137
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: master (8.0), 7.2.1
>            Reporter: Jim Ferenczi
>            Assignee: Jim Ferenczi
>            Priority: Major
>
> The automaton built for graph queries that contain multiple multi-word
> synonyms does not handle gaps that appear in the middle of a multi-word
> synonym. In that case the token next to the gap is treated as part of the
> multi-word synonym.
> Stop words that appear before or after multi-word synonyms are handled
> correctly in the current version, but the synonym rule "part of speech, pos",
> for instance, does not create the expected query if "of" is removed by a
> filter placed after the synonym_graph filter. One solution would be to reuse
> TokenStreamToAutomaton (with minor changes to add the ability to create token
> transitions rather than chars), which preserves gaps (as transitions) in the
> produced automaton.
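
To make the gap problem in the description concrete, here is a minimal, self-contained sketch of the idea of keeping a gap as its own transition. It is not Lucene code and not the proposed patch: GapGraphSketch, Token, Edge, and the "_" hole label are hypothetical names chosen for illustration, and the token values assume that the synonym graph emits "pos" with a position length of 3 alongside "part of speech", with "of" removed afterwards so that "speech" arrives with a position increment of 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: these types are hypothetical and are not Lucene classes.
public class GapGraphSketch {

    // Assumed marker for a position left empty by a removed token.
    static final String HOLE = "_";

    // A token as an analysis chain reports it.
    static final class Token {
        final String term;
        final int posInc; // position increment relative to the previous token
        final int posLen; // number of positions the token spans
        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    // An edge of the token graph: consuming `label` leads to state `to`.
    static final class Edge {
        final String label;
        final int to;
        Edge(String label, int to) {
            this.label = label;
            this.to = to;
        }
    }

    // Builds state -> outgoing edges; each state is a token position.
    static Map<Integer, List<Edge>> build(List<Token> tokens) {
        Map<Integer, List<Edge>> graph = new TreeMap<>();
        int pos = -1;
        for (Token t : tokens) {
            int prev = pos;
            pos += t.posInc;
            // An increment above 1 skips positions prev+1 .. pos-1; keep each
            // skipped position as an explicit hole edge instead of dropping it.
            for (int p = prev + 1; p < pos; p++) {
                graph.computeIfAbsent(p, k -> new ArrayList<>()).add(new Edge(HOLE, p + 1));
            }
            graph.computeIfAbsent(pos, k -> new ArrayList<>())
                 .add(new Edge(t.term, pos + t.posLen));
        }
        return graph;
    }

    // Enumerates every path from position 0 to the last position.
    static List<String> paths(List<Token> tokens) {
        Map<Integer, List<Edge>> graph = build(tokens);
        int end = 0;
        for (List<Edge> edges : graph.values()) {
            for (Edge e : edges) {
                end = Math.max(end, e.to);
            }
        }
        List<String> out = new ArrayList<>();
        walk(graph, 0, end, new ArrayList<>(), out);
        return out;
    }

    private static void walk(Map<Integer, List<Edge>> graph, int state, int end,
                             List<String> path, List<String> out) {
        if (state == end) {
            out.add(String.join(" ", path));
            return;
        }
        for (Edge e : graph.getOrDefault(state, Collections.emptyList())) {
            path.add(e.label);
            walk(graph, e.to, end, path, out);
            path.remove(path.size() - 1);
        }
    }

    public static void main(String[] args) {
        List<Token> tokens = Arrays.asList(
            new Token("pos", 1, 3),     // single-token side of the synonym
            new Token("part", 0, 1),    // same start position as "pos"
            new Token("speech", 2, 1)); // "of" was removed, leaving a gap
        System.out.println(paths(tokens)); // prints [pos, part _ speech]
    }
}
```

In this sketch the gap left by "of" survives as the "_" step between "part" and "speech", so the multi-word side path still spans three positions and ends at the same state as "pos". How a real query builder should interpret such a hole transition is left open here.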