Re: SOLR ranking

Alessandro Benedetti Thu, 18 Feb 2016 16:02:23 -0800

Hey Binoy,
to be honest, I can't understand why such complexity is needed :/
Can you explain why playing with:


edismax
mm (the percentage of query terms that must match in a result)
pf (the fields to boost when the phrase matches)
ps (the phrase slop to allow)

should not solve the problem, instead of the two-phase query?
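
For instance, a single request along these lines might be enough. This is only a sketch in Python that builds the request URL: the host, core, and field names are taken from the query quoted below, while the `mm` and `ps` values and the helper name `build_edismax_url` are assumptions to be tuned, not recommendations. The `pf` fields are spelled as they are declared in the quoted schema.

```python
from urllib.parse import urlencode


def build_edismax_url(base_url, user_query, qf, pf, mm="100%", ps=2):
    """Build one edismax request that both matches terms and boosts phrases."""
    params = {
        "q": user_query,
        "defType": "edismax",
        "qf": qf,        # fields searched for the individual terms
        "pf": pf,        # fields boosted when the whole phrase matches
        "mm": mm,        # assumed value: every query term must match
        "ps": str(ps),   # assumed value: phrase slop allowed for pf
        "debugQuery": "true",
    }
    return base_url + "?" + urlencode(params)


url = build_edismax_url(
    "http://localhost:8983/solr/tgl/select",
    "rheumatoid arthritis",
    qf="topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3",
    pf="topTitle^200 subTopTitle^80 indTerm^40 drugString^30 tglData^6",
)
print(url)
```

With `debugQuery=true` the response carries the parsed query, which should make it easy to check whether the phrase boosts are actually applied.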

Cheers

On 18 February 2016 at 18:09, Binoy Dalal <binoydala...@gmail.com> wrote:

> Here's an alternative solution that may be of some help.
> Here I'm assuming that you are not directly outputting the search results
> to the user and have some sort of layer between the results from solr and
> presentation to the user where some additional processing can be performed.
>
> 1) You already know that you want phrase matches to show up higher than
> single matches. In this case, why not do an explicit phrase match first,
> with some slop or as is, depending on how close you want the phrase terms
> to be to each other.
> 2) Once you have the results from the first query, fire an OR query with
> your terms and get those results.
> 3) Put results from (2) after (1) and present to the user. This happens in
> the app layer.
>
> This is essentially the same as running a query as such: "Rheumatoid
> Arthritis"~slop OR (Rheumatoid AND Arthritis) but you don't need to worry
> about the ordering because you're sorting your results.
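
One detail of step (3): documents that match the phrase will normally come back from the term query as well, so the app layer has to drop those duplicates when appending. A rough Python sketch of that merge follows. It assumes each result is a dict carrying a unique `id` field, and `merge_ranked` is a hypothetical helper name, not part of any Solr client.

```python
def merge_ranked(phrase_docs, term_docs, id_field="id"):
    """Return phrase matches first, then term matches not already listed."""
    seen = set()
    merged = []
    for doc in list(phrase_docs) + list(term_docs):
        doc_id = doc[id_field]
        if doc_id not in seen:   # skip documents already placed higher up
            seen.add(doc_id)
            merged.append(doc)
    return merged


# Illustrative data only: d1 and d2 matched the phrase, d3 matched terms only.
phrase_hits = [{"id": "d1"}, {"id": "d2"}]
term_hits = [{"id": "d2"}, {"id": "d3"}, {"id": "d1"}]
merged = merge_ranked(phrase_hits, term_hits)
print([d["id"] for d in merged])  # ['d1', 'd2', 'd3']
```
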
>
> Now, this will obviously take more time since you're querying twice and
> then doing the additional processing in the app layer, but provided your
> architecture is balanced enough and can cope with a little extra load, I do
> not think that your performance will take that bad a hit. Moreover since
> you're in a hurry, you could implement this as a quick and dirty solution
> to meet the project goals, provided it fits the acceptance parameters and
> then later play around with the scoring/sorting and figure out the best
> possible setup to suit your needs.
>
> On Thu, Feb 18, 2016 at 4:22 PM Emir Arnautovic <
> emir.arnauto...@sematext.com> wrote:
>
> > Hi Nitin,
> > Can you send us what your parsed query looks like (from the debug output)?
> >
> > Thanks,
> > Emir
> >
> > On 17.02.2016 08:38, Nitin.K wrote:
> > > Hi Binoy,
> > >
> > > > We are searching for both phrases and individual words,
> > > > but we want the documents that contain the phrase to come first
> > > > in the order, followed by the individual-word matches.
> > >
> > > termPositions = true is also not working in my case.
> > >
> > > > I have also removed the string type from the copy fields. Kindly look at
> > > > the changed configuration below:
> > >
> > > Hi Emir,
> > >
> > > > I have changed the configuration as per your suggestion and added pf2 /
> > > > pf3. Yes, I saw the difference, but the ranking is still not followed
> > > > correctly in the case of phrases.
> > > >
> > > > Changed configuration:
> > >
> > > > <field name="topic_title" type="text_general" indexed="true" stored="true"/>
> > > > <field name="topTitle" type="text_phrase" indexed="true" stored="false"/>
> > > >
> > > > <field name="subtopic_title" type="text_general" indexed="true" stored="true"/>
> > > > <field name="subTopTitle" type="text_phrase" indexed="true" stored="false"/>
> > > >
> > > > <field name="index_term" type="text_ws" indexed="true" stored="true" multiValued="true"/>
> > > > <field name="indTerm" type="text_phrase" indexed="true" stored="false" multiValued="true"/>
> > > >
> > > > <field name="drug" type="text_ws" indexed="true" stored="true" multiValued="true"/>
> > > > <field name="drugString" type="text_phrase" indexed="true" stored="false" multiValued="true"/>
> > > >
> > > > <field name="tglData" type="text_phrase" indexed="true" stored="false"/>
> > >
> > > Copy fields again for the reference :
> > >
> > > <copyField source="topic_title" dest="topTitle"/>
> > > <copyField source="subtopic_title" dest="subTopTitle"/>
> > > <copyField source="index_term" dest="indTerm"/>
> > > <copyField source="drug" dest="drugString"/>
> > > <copyField source="content" dest="tglData"/>
> > >
> > > Added following field type:
> > >
> > > <fieldType name="text_phrase" class="solr.TextField"
> > > positionIncrementGap="100" omitNorms="true">
> > >       <analyzer>
> > >               <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >               <filter class="solr.StopFilterFactory" ignoreCase="true"
> > > words="stopwords.txt" />
> > >               <filter class="solr.LowerCaseFilterFactory"/>
> > >       </analyzer>
> > > </fieldType>
> > >
> > > Removed the string type from the copy fields.
> > >
> > > Changed Query :
> > >
> > >
> >
> http://localhost:8983/solr/tgl/select?q=rheumatoid%20arthritis&wt=xml&tie=1.0&rows=200&q.op=AND&indent=true&defType=edismax&stopwords=true&lowercaseOperators=true&debugQuery=true&;
> > > pf=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
> > > pf2=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
> > > pf3=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
> > > qf=topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3
> > >
> > > > After making these changes, I get my search results in the correct order
> > > > for a single term, but for a phrase search I still do not get the
> > > > results in the correct order.
> > >
> > > Hi Modassar,
> > >
> > > I tried using mm=100, but the order is still the same.
> > >
> > > Hi Alessandro,
> > >
> > > > I have not yet tried the slop parameter. By default it is taken as 1.0,
> > > > as I saw in debug mode. I will definitely get back to you. So, let me
> > > > try this option too.
> > >
> > > All,
> > >
> > > > Please let me know if anyone has any other suggestions on this. I have
> > > > to implement it urgently and I think I am very close. Thanks to all
> > > > of you. I have reached this level only because of you guys.
> > >
> > > Thanks and Regards,
> > > Nitin
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257782.html
> > > Sent from the Solr - User mailing list archive at Nabble.com.
> >
> > --
> > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > Solr & Elasticsearch Support * http://sematext.com/
> >
> > --
> Regards,
> Binoy Dalal
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
