Re: Re: Solr edismax parser with multi-word synonyms

2019-07-18 Thread Sunil Srinivasan
Hi Erick,
Is there any way I can get it to match documents containing at least one of the
words of the original query, i.e. 'frozen' or 'dinner' or both (but not
partial matches of the synonyms)?
Thanks,
Sunil




Re: Solr edismax parser with multi-word synonyms

2019-07-18 Thread Erick Erickson
This is not a phrase query; rather, it requires either pair of words
to appear in the title.

You’ve told it that “frozen dinner” and “microwave food” are synonyms.
So it’s looking for both the words “microwave” and “food” in the title field,
or both “frozen” and “dinner” in the title field.

You’d see the same thing with single-word synonyms, albeit a little less
confusingly.
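
To make the boolean logic concrete, here is a small toy model in Python. It is only a sketch of how the parsed query from the debug output behaves, not Solr code; the function name and the whitespace tokenization are assumptions made for illustration.

```python
# Toy model (not Solr itself) of the parsed query from the debug output:
#   +(((+title:microwave +title:food) (+title:frozen +title:dinner)))
# ASSUMPTION: titles are tokenized by lowercasing and splitting on whitespace.

def matches_parsed_query(title: str) -> bool:
    terms = set(title.lower().split())
    # Each inner clause marks both of its terms with '+', so both are required.
    synonym_clause = {"microwave", "food"} <= terms
    original_clause = {"frozen", "dinner"} <= terms
    # The two clauses are alternatives: at least one of them has to match.
    return synonym_clause or original_clause


if __name__ == "__main__":
    for title in ("frozen chicken dinner", "dinner plates", "microwave safe food box"):
        print(title, "->", matches_parsed_query(title))
```

Note that "frozen chicken dinner" matches even though the two words are not adjacent, which is why this is not a phrase query, while a title containing only "dinner" does not match.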


Best,
Erick





Re: Solr edismax parser with multi-word synonyms

2019-07-18 Thread kshitij tyagi
Hi Sunil,

Because you have added "microwave food" to the synonym file as a multi-word
synonym of "frozen dinner", the edismax parser finds your synonym in the file
and treats your query as a phrase query.

This is why you are seeing the parsed query as +(((+title:microwave
+title:food) (+title:frozen +title:dinner))); "frozen dinner" is treated
as a phrase here.

If you want partial matches on your query, you can add frozen dinner,
microwave food, microwave, food to your synonym file, and you will see the
parsed query as:
"+(((+title:microwave +title:food) title:microwave title:food
(+title:frozen +title:dinner)))"

Another option is to write your own custom query parser and use it as a
plugin.

Hope this helps!!

kshitij




Solr edismax parser with multi-word synonyms

2019-07-17 Thread Sunil Srinivasan

I have enabled the SynonymGraphFilter in my field configuration in order to 
support multi-word synonyms (I am using Solr 7.6). Here is my field 
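configuration:

[The field type XML was stripped by the list archive. The only fragment that
survives in the quoted copies of this message is the tail of one element:
synonyms="synonyms.txt"/>]
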

And this is my synonyms.txt file:
frozen dinner,microwave food

Scenario 1: blue shirt (query with no synonyms)

Here is my first Solr query:
http://localhost:8983/solr/base/search?q=blue+shirt=title=edismax=on

And this is the parsed query I see in the debug output:
+((title:blue) (title:shirt))
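
The archive appears to have dropped the parameter names from the URL above (only the values title, edismax and on survive). As a rough sketch, a request of this shape could be built as follows in Python. The parameter names qf, defType and debugQuery are assumptions, since the original names are not recoverable from the archived text, and build_search_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# ASSUMPTION: the base URL is taken from the message; the parameter names
# qf, defType and debugQuery are guesses, because the archive kept only
# their values (title, edismax, on).
BASE_URL = "http://localhost:8983/solr/base/search"


def build_search_url(user_query: str) -> str:
    params = {
        "q": user_query,       # the user's query text
        "qf": "title",         # assumed: field(s) edismax searches
        "defType": "edismax",  # assumed: selects the edismax parser
        "debugQuery": "on",    # assumed: asks for the parsed query in the response
    }
    # urlencode escapes spaces as '+', matching the form used in the message.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("blue shirt"))
```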

Scenario 2: frozen dinner (query with synonyms)

Now, here is my second Solr query:
http://localhost:8983/solr/base/search?q=frozen+dinner=title=edismax=on

And this is the parsed query I see in the debug output:
+(((+title:microwave +title:food) (+title:frozen +title:dinner)))

I am wondering why the first query looks for documents containing at least one 
of the two query tokens, whereas the second query looks for documents with both 
of the query tokens? I would understand if it looked for both the tokens of the 
synonyms (i.e. both microwave and food) to avoid the sausagization problem. But 
I would like to get partial matches on the original query at least (i.e. it 
should also match documents containing just the token 'dinner').

Would anyone know why the behavior differs between queries with and
without synonyms? And how could I work around this if I want partial matches
on queries that also have synonyms?

Ideally, I would like the parsed query in the second case to be:
+(((+title:microwave +title:food) (title:frozen title:dinner)))
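
To spell out the difference between the two parsed queries, here is a toy comparison in Python. It is a sketch of the boolean semantics only, not of Solr; the helper names and the whitespace tokenization are assumptions made for illustration.

```python
# Toy comparison (not Solr itself) of the two parsed queries.
# ASSUMPTION: titles are tokenized by lowercasing and splitting on whitespace.

def title_terms(title: str) -> set:
    return set(title.lower().split())


def actual_query(title: str) -> bool:
    # +(((+title:microwave +title:food) (+title:frozen +title:dinner)))
    terms = title_terms(title)
    return {"microwave", "food"} <= terms or {"frozen", "dinner"} <= terms


def desired_query(title: str) -> bool:
    # +(((+title:microwave +title:food) (title:frozen title:dinner)))
    terms = title_terms(title)
    both_synonym_terms = {"microwave", "food"} <= terms
    any_original_term = bool({"frozen", "dinner"} & terms)
    return both_synonym_terms or any_original_term


if __name__ == "__main__":
    for title in ("dinner plates", "microwave oven", "microwave food tray"):
        print(title, "actual:", actual_query(title), "desired:", desired_query(title))
```

The only titles on which the two differ are those containing exactly one of 'frozen' or 'dinner' (and not both 'microwave' and 'food'), which is the partial match I am after.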

I'd appreciate any help with this. Thanks!