[
https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894248#comment-13894248
]
Tien Nguyen Manh edited comment on SOLR-5379 at 2/7/14 7:03 AM:
[~markus17] That is not the desired behaviour.
Your result above for the first example, with synonyms [seabiscuit,sea biscit,biscit]:
q=sea biscit = (+(DisjunctionMaxQuery((name:sea))
DisjunctionMaxQuery(((name:seabiscuit name:sea biscit
name:biscit)/no_coord
This appears to be the default behaviour (without the SynonymQuotedDismaxQParser).
With the SynonymQuotedDismaxQParser, all three queries should produce the same
result: q=biscit, q=seabiscuit, q=sea biscit
was (Author: tiennm):
[~markus17] That is not the desired behaviour.
Your result above for the first example, with synonyms [seabiscuit,sea biscit,biscit]:
q=sea biscit = (+(DisjunctionMaxQuery((name:sea))
DisjunctionMaxQuery(((name:seabiscuit name:sea biscit
name:biscit)/no_coord
This appears to be the default behaviour (without the patch).
After applying the patch, all three queries should produce the same result:
q=biscit, q=seabiscuit, q=sea biscit
Query-time multi-word synonym expansion
---
Key: SOLR-5379
URL: https://issues.apache.org/jira/browse/SOLR-5379
Project: Solr
Issue Type: Improvement
Components: query parsers
Reporter: Tien Nguyen Manh
Labels: multi-word, queryparser, synonym
Fix For: 4.7
Attachments: quoted.patch, synonym-expander.patch
When handling synonyms at query time, Solr fails to work with multi-word
synonyms, for two reasons:
- First, the Lucene query parser tokenizes the user query on whitespace, so it
splits a multi-word term into separate terms before they reach the synonym
filter. The synonym filter therefore cannot recognize the multi-word term and
expand it.
- Second, when the synonym filter expands a term into several alternatives
that include a multi-word synonym, SolrQueryParserBase currently uses
MultiPhraseQuery to handle the synonyms. But MultiPhraseQuery does not work
when the alternatives have different numbers of words.
For the first problem, we can quote every multi-word synonym in the user query
so that the Lucene query parser does not split it. There is a related JIRA
issue: https://issues.apache.org/jira/browse/LUCENE-2605.
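As a rough illustration of the quoting step, the sketch below wraps every
known multi-word synonym found in a raw query string in double quotes before
the query would reach the parser. It is plain Java with no Solr or Lucene
dependency. The class name MultiWordSynonymQuoter, the method
quoteMultiWordSynonyms, and the in-memory synonym list are hypothetical and
are not taken from the attached patches. It also assumes that words in a
synonym are separated by exactly one space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: quotes multi-word synonyms in a raw query.
class MultiWordSynonymQuoter {

    /** Returns the query with each multi-word synonym wrapped in double quotes. */
    static String quoteMultiWordSynonyms(String query, List<String> synonyms) {
        // Keep only the synonyms that contain more than one word.
        List<String> multiWord = new ArrayList<>();
        for (String s : synonyms) {
            String trimmed = s.trim();
            if (trimmed.contains(" ")) {
                multiWord.add(trimmed);
            }
        }
        if (multiWord.isEmpty()) {
            return query;
        }
        // Try longer synonyms first so they win over shorter overlapping ones.
        multiWord.sort(Comparator.comparingInt(String::length).reversed());

        StringBuilder alternation = new StringBuilder();
        for (String s : multiWord) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(s));
        }
        // The lookarounds require a whole-word match that is not already
        // directly adjacent to a double quote.
        Pattern p = Pattern.compile(
                "(?<![\\w\"])(?:" + alternation + ")(?![\\w\"])",
                Pattern.CASE_INSENSITIVE);

        Matcher m = p.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement("\"" + m.group() + "\""));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> synonyms = Arrays.asList("seabiscuit", "sea biscit", "biscit");
        System.out.println(quoteMultiWordSynonyms("sea biscit", synonyms));
        System.out.println(quoteMultiWordSynonyms("biscit", synonyms));
    }
}
```

With the synonym list from the comment above, the query sea biscit becomes the
quoted phrase "sea biscit", while biscit and seabiscuit are left unchanged.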
For the second, we can replace the MultiPhraseQuery with an appropriate
BooleanQuery of SHOULD clauses, one PhraseQuery per multi-word synonym, when
the token stream contains a multi-word synonym.
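The shape of that replacement query can be sketched without Lucene on the
classpath. The model below is only an approximation: it renders each synonym
as one optional clause in a Lucene-like textual form, using a bare term where
a TermQuery would go and a quoted phrase where a PhraseQuery would go. The
class SynonymExpansionSketch and its methods are hypothetical, and the real
patch would build Lucene query objects rather than strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical string-based model of the proposed expansion, not Lucene code.
class SynonymExpansionSketch {

    /** One optional (SHOULD) clause: a term for one word, a phrase for several. */
    static String clause(String field, String synonym) {
        String text = synonym.trim();
        return text.contains(" ")
                ? field + ":\"" + text + "\""   // stands in for a PhraseQuery
                : field + ":" + text;           // stands in for a TermQuery
    }

    /**
     * If the query text belongs to the synonym group, returns one optional
     * clause per group member; otherwise returns a single clause for the
     * query text itself.
     */
    static String expand(String field, String queryText, List<String> group) {
        String needle = queryText.trim();
        boolean inGroup = false;
        for (String s : group) {
            if (s.trim().equalsIgnoreCase(needle)) {
                inGroup = true;
            }
        }
        if (!inGroup) {
            return clause(field, needle);
        }
        List<String> clauses = new ArrayList<>();
        for (String s : group) {
            clauses.add(clause(field, s));
        }
        // Space-separated clauses model optional (SHOULD) clauses.
        return "(" + String.join(" ", clauses) + ")";
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("seabiscuit", "sea biscit", "biscit");
        for (String q : Arrays.asList("biscit", "seabiscuit", "sea biscit")) {
            System.out.println(q + " -> " + expand("name", q, group));
        }
    }
}
```

Because every member of the group expands to the same set of clauses, the
three queries biscit, seabiscuit, and sea biscit yield identical output in
this model, which is the behaviour the comment above asks for.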