[ 
https://issues.apache.org/jira/browse/SOLR-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020221#comment-14020221
 ] 

Jeremy Anderson commented on SOLR-5379:
---------------------------------------

I'm in the process of porting this logic to the 4.8.1 release tag.  I believe 
the code itself is ported, but I'm having problems getting the unit test to 
run to confirm that the port is correct.  The main reason is that the 
conf/solrconfig.xml and conf/schema.xml files in the root differ from those 
I'm guessing Tien used when the 4.5.0 patch was created.  

I'm still a Solr novice, so I'm not quite sure how to replicate the schema 
and configuration settings needed to get the unit test to run.  I'm going to 
attach patch files for the 4.8.1 code base shortly, along with the current 
stubbed-out configuration files.

Any help anyone can provide would be greatly appreciated.  My end goal is to 
get the multi-term synonym expansion logic working with a 4.8.1 deployment 
where we're using an extended version of the SolrQueryParser.  (I'm not sure 
whether, with this patch, the multi-term synonym logic is usable only by the 
new SynonymQuotedDismaxQParser or also by the existing DismaxQParsers.)

Notes on 4.8.1 port:
* There are now two parsers usable by the FSTSynonymFilterFactory: 
SolrSynonymParser and WordnetSynonymParser.  For the latter, I'm not sure 
whether any additional logic needs to be implemented for proper handling of 
the tokenize parameter.
* All of the logic implemented in SolrQueryParserBase from 4.5.0 has now been 
moved into the utility QueryBuilder class.


> Query-time multi-word synonym expansion
> ---------------------------------------
>
>                 Key: SOLR-5379
>                 URL: https://issues.apache.org/jira/browse/SOLR-5379
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>            Reporter: Tien Nguyen Manh
>              Labels: multi-word, queryparser, synonym
>             Fix For: 4.9, 5.0
>
>         Attachments: quoted.patch, synonym-expander.patch
>
>
> When dealing with synonyms at query time, Solr fails to work with multi-word 
> synonyms for two reasons:
> - First, the Lucene query parser tokenizes the user query on whitespace, so 
> it splits a multi-word term into separate terms before feeding them to the 
> synonym filter, and the synonym filter cannot recognize the multi-word term 
> to expand it.
> - Second, if the synonym filter expands into multiple terms that include a 
> multi-word synonym, SolrQueryParserBase currently uses MultiPhraseQuery to 
> handle the synonyms, but MultiPhraseQuery does not work with terms that have 
> different numbers of words.
> For the first problem, we can quote every multi-word synonym in the user 
> query so that the Lucene query parser does not split it. There is a related 
> JIRA issue: https://issues.apache.org/jira/browse/LUCENE-2605.
> For the second, we can replace MultiPhraseQuery with an appropriate 
> BooleanQuery of SHOULD clauses containing multiple PhraseQuery instances when 
> the token stream contains a multi-word synonym.
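
For illustration, the two ideas in the issue description above can be sketched 
in plain Java with no Lucene or Solr dependency.  This is only a minimal 
sketch under stated assumptions, not the code from either attached patch: the 
class and method names (MultiWordSynonymSketch, quoteMultiWordSynonyms, 
toShouldOfPhrases) and the "body" field are hypothetical, the list of 
multi-word synonyms is assumed to be known up front, and the second method 
only renders a Lucene-style query string rather than building real 
BooleanQuery/PhraseQuery objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class MultiWordSynonymSketch {

    // Idea 1: wrap each known multi-word synonym in double quotes so that a
    // parser which splits on whitespace keeps it together as one phrase.
    // Assumption: the caller supplies the multi-word terms explicitly.
    static String quoteMultiWordSynonyms(String query, List<String> multiWordTerms) {
        if (multiWordTerms.isEmpty()) {
            return query;
        }
        // Try longer terms first so "new york city" wins over "new york".
        List<String> sorted = new ArrayList<>(multiWordTerms);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));
        List<String> escaped = new ArrayList<>();
        for (String term : sorted) {
            escaped.add(Pattern.quote(term));
        }
        // Match only whole whitespace-delimited runs; text that is already
        // inside quotes is left alone because a quote is not whitespace.
        Pattern pattern = Pattern.compile(
                "(?<!\\S)(?:" + String.join("|", escaped) + ")(?!\\S)");
        return pattern.matcher(query).replaceAll("\"$0\"");
    }

    // Idea 2: instead of one multi-phrase query (which needs every alternative
    // to have the same number of words), emit one clause per alternative and
    // OR them together, i.e. every clause is optional (SHOULD).
    static String toShouldOfPhrases(String field, List<String> alternatives) {
        List<String> clauses = new ArrayList<>();
        for (String alternative : alternatives) {
            String text = alternative.trim();
            boolean multiWord = text.contains(" ");
            // Multi-word alternatives become phrases, single words plain terms.
            clauses.add(field + ":" + (multiWord ? "\"" + text + "\"" : text));
        }
        return "(" + String.join(" OR ", clauses) + ")";
    }

    public static void main(String[] args) {
        List<String> multiWord = Arrays.asList("sea biscuit");
        System.out.println(quoteMultiWordSynonyms("buy sea biscuit online", multiWord));
        System.out.println(toShouldOfPhrases("body",
                Arrays.asList("sea biscuit", "seabiscuit")));
    }
}
```

The second method shows why alternatives of different lengths stop being a 
problem: each alternative is an independent optional clause, so a two-word 
phrase and a one-word term can sit side by side.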



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
