Re: Regarding multi keyword search

2018-10-23 Thread Shawn Heisey

On 10/23/2018 8:20 AM, Gauri Dhawan wrote:

> I have been facing an issue for quite some time and haven't been able to
> come to a solution as of yet. We are trying to implement search on our
> platform and all our data is stored in Solr.
>
> I have a field `description` which is the field where I have to search.
> It is of the field type `text_edit_suggest` and it looks something like this
>
> When I search for multiple keywords, the results are unexpected.
> For example :
> I want to search for the words `first` and `post` and both these words
> should be present in the description field of the document else it
> shouldn't return the document.


Your index analysis has two tokenizers.  You can only have one.  There 
is at least one typo in the fieldType definition provided.  After I fix 
that, Solr 7.5.0 won't load the core, with this as the error:


Plugin init failure for [schema.xml] fieldType "text_suggest_edge": 
Plugin init failure for [schema.xml] analyzer/tokenizer: The schema 
defines multiple tokenizers for: [tokenizer: null]


What version of Solr are you running?  Have you explicitly included the 
"sow" parameter on your query, or in the handler definition?


The KeywordTokenizerFactory that you're using probably doesn't do what 
you think it does.  It preserves the entire input as a single token -- 
doesn't split it into separate words.  The kind of searching you 
mentioned likely isn't possible with the analysis chain you've got.  It 
might take a bunch of back and forth question/answer cycles to get to 
something useful.
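
To make the single-token behaviour concrete, here is a rough Python
sketch. It is not Solr code: `keyword_tokenize`, `whitespace_tokenize`
and `matches_all` are hypothetical stand-ins that only mimic the
behaviour described above, and the sample description is made up. The
rest of the posted analysis chain (pattern replacements, n-grams,
synonyms) is ignored.

```python
def keyword_tokenize(text):
    """Mimic a keyword-style tokenizer: the whole input becomes one token."""
    return [text]


def whitespace_tokenize(text):
    """Mimic a word-splitting tokenizer: one token per whitespace-separated word."""
    return text.split()


def matches_all(doc_tokens, query_terms):
    """Return True only if every query term equals some indexed token."""
    indexed = {token.lower() for token in doc_tokens}
    return all(term.lower() in indexed for term in query_terms)


description = "My first blog post"   # made-up sample document
query_terms = ["first", "post"]

# Single-token index: neither "first" nor "post" equals the one big token.
print(matches_all(keyword_tokenize(description), query_terms))     # False
# Word-level index: both terms exist as separate tokens.
print(matches_all(whitespace_tokenize(description), query_terms))  # True
```

Under this simplified model, requiring both words can only work once
the field is indexed as separate words.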


In my strong opinion, that KeywordTokenizerFactory has a terrible name 
and needs a new one.  Anyone want to bikeshed the possibilities?


Thanks,
Shawn



Re: Regarding multi keyword search

2018-10-23 Thread Walter Underwood
100% on mm is dangerous. If there is one misspelled or wrong word, there are 
zero matches.
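
A toy Python model of that failure mode, as a sketch only. It assumes
a percentage-style minimum that is rounded down and a document that is
just a set of words; `required_matches` and `is_match` are
hypothetical helpers, not Solr functions.

```python
def required_matches(num_terms, mm_percent):
    """Terms that must match for a percentage minimum, rounded down."""
    return (num_terms * mm_percent) // 100


def is_match(doc_words, query_terms, mm_percent):
    """True if enough query terms occur in the document's word set."""
    hits = sum(1 for term in query_terms if term in doc_words)
    return hits >= required_matches(len(query_terms), mm_percent)


doc_words = {"my", "first", "blog", "post"}   # made-up sample document

print(is_match(doc_words, ["first", "post"], 100))  # True: both words present
print(is_match(doc_words, ["frist", "post"], 100))  # False: one typo, no match
print(is_match(doc_words, ["frist", "post"], 50))   # True: one of two is enough
```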

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 23, 2018, at 8:25 AM, ANNAMANENI RAVEENDRA  
> wrote:
> 
> You should use mm parameter and it should be set to 100 if you use dismax
> or edismax
> 
> 
> On Tue, Oct 23, 2018 at 11:18 AM Gauri Dhawan 
> wrote:
> 
>> Hi!
>> I have been facing an issue for quite some time and haven't been able to
>> come to a solution as of yet. We are trying to implement search on our
>> platform and all our data is stored in Solr.
>> 
>> I have a field `description` which is the field where I have to search.
>> It is of the field type `text_edit_suggest` and it looks something like
>> this
>> 
>>> pattern="([\.,;:-_])" replacement=" " replace="all"/>
>>> minGramSize="1"/>
>>> pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
>>> ignoreCase="true" expand="false"/>
>>>
>>> pattern="([\.,;:-_])" replacement=" " replace="all"/>
>>> pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
>>> pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
>>> ignoreCase="true" expand="false"/>
>> 
>> 
>> 
>> When I search for multiple keywords, the results are unexpected.
>> For example :
>> I want to search for the words `first` and `post` and both these words
>> should be present in the description field of the document else it
>> shouldn't return the document.
>> I've tried some 50+ queries for this. Used `edismax` parser as well but in
>> vain.
>> 
>> Tried boosting as well. But most queries result in weight given to either
>> one of the keywords and results in documents that have that keyword but not
>> the other. Can you guys help? Thanks in advance!
>> 
>> 
>> Gauri Dhawan
>> Associate Software Engineer
>> SHEROES
>> 



Re: Regarding multi keyword search

2018-10-23 Thread ANNAMANENI RAVEENDRA
You should use the mm parameter, set to 100%, if you use dismax
or edismax.
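
As a sketch of what such a request might look like, the Python snippet
below builds the query string. `defType`, `q`, `qf` and `mm` are
standard Solr request parameters; the `/select` path is an assumption,
and the host and core name are left out because the thread does not
give them. Requiring every term only helps if the field is indexed as
separate words.

```python
from urllib.parse import urlencode

# Request parameters for an edismax query that requires every term to match.
params = {
    "defType": "edismax",   # use the extended dismax query parser
    "q": "first post",      # the two keywords from the question
    "qf": "description",    # field to search
    "mm": "100%",           # all optional clauses must match
}

query_string = urlencode(params)
print("/select?" + query_string)
# /select?defType=edismax&q=first+post&qf=description&mm=100%25
```

Note that the percent sign has to be URL-encoded as `%25` when the
request is sent over HTTP.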


On Tue, Oct 23, 2018 at 11:18 AM Gauri Dhawan 
wrote:

> Hi!
> I have been facing an issue for quite some time and haven't been able to
> come to a solution as of yet. We are trying to implement search on our
> platform and all our data is stored in Solr.
>
> I have a field `description` which is the field where I have to search.
> It is of the field type `text_edit_suggest` and it looks something like
> this
>
> > pattern="([\.,;:-_])" replacement=" " replace="all"/>
> > minGramSize="1"/>
> > pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
> > ignoreCase="true" expand="false"/>
> >
> > pattern="([\.,;:-_])" replacement=" " replace="all"/>
> > pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
> > pattern="^(.{30})(.*)?" replacement="$1" replace="all"/>
> > ignoreCase="true" expand="false"/>
>
>
>
> When I search for multiple keywords, the results are unexpected.
> For example :
> I want to search for the words `first` and `post` and both these words
> should be present in the description field of the document else it
> shouldn't return the document.
> I've tried some 50+ queries for this. Used `edismax` parser as well but in
> vain.
>
> Tried boosting as well. But most queries result in weight given to either
> one of the keywords and results in documents that have that keyword but not
> the other. Can you guys help? Thanks in advance!
>
>
> Gauri Dhawan
> Associate Software Engineer
> SHEROES
>