Re: SOLR ranking

Binoy Dalal Thu, 18 Feb 2016 16:23:35 -0800

Hi Alessandro,
Don't get me wrong. Using mm, ps and pf can and absolutely will solve his
problem.
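
As a rough illustration of that route, here is a minimal Python sketch of the request parameters involved. The field names and boosts are copied from Nitin's schema and query quoted below (with subTopTitle spelled as in the schema); the mm and ps values and the helper name build_edismax_params are only assumptions for illustration, not tuned settings.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, mm="100%", ps=2):
    """Return edismax request parameters that boost phrase matches.

    mm and ps defaults are illustrative assumptions, not recommendations.
    """
    return {
        "q": user_query,
        "defType": "edismax",
        # Fields searched term by term (from the quoted query).
        "qf": "topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3",
        # Fields boosted when the whole query matches as a phrase.
        "pf": "topTitle^200 subTopTitle^80 indTerm^40 drugString^30 tglData^6",
        "mm": mm,  # minimum share of query terms that must match
        "ps": ps,  # slop allowed for the pf phrase match
        "rows": 200,
        "wt": "json",
    }


params = build_edismax_params("rheumatoid arthritis")
# Core name and host are taken from the quoted query URL.
url = "http://localhost:8983/solr/tgl/select?" + urlencode(params)
print(url)
```

Running the request with debugQuery=true added should show whether the pf clause is actually contributing to the score.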


Like I said above, my solution is meant to be a quick and dirty fix. It's
really not that complex and shouldn't take more than an hour to set up at
the app level. Moreover, I suggested it because he said it was urgent for
him, and setting up a proper config with mm, pf and ps might take him much
longer.
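
For what it's worth, the app-level merge could look roughly like the sketch below. It assumes a hypothetical search(query, rows) callable that wraps whatever Solr client is already in use and returns documents with an "id" key in score order; that callable, the function name and the default slop are illustrative only.

```python
def two_phase_search(search, terms, slop=2, rows=200):
    """Run a phrase query, then an OR query, and list phrase hits first.

    `search` is a hypothetical callable (query_string, rows) -> list of
    dicts, each with an "id" key, already ordered by Solr score.
    """
    phrase_query = '"{}"~{}'.format(" ".join(terms), slop)
    or_query = " OR ".join(terms)

    phrase_hits = search(phrase_query, rows)
    or_hits = search(or_query, rows)

    seen = {doc["id"] for doc in phrase_hits}
    # Append only the OR hits that the phrase query did not already return.
    rest = [doc for doc in or_hits if doc["id"] not in seen]
    return phrase_hits + rest


def fake_search(query, rows):
    """Stand-in for a real Solr call, used only to exercise the merge."""
    index = {
        '"rheumatoid arthritis"~2': [{"id": "a"}, {"id": "b"}],
        "rheumatoid OR arthritis": [
            {"id": "c"}, {"id": "a"}, {"id": "b"}, {"id": "d"},
        ],
    }
    return index.get(query, [])[:rows]


merged = two_phase_search(fake_search, ["rheumatoid", "arthritis"])
print([doc["id"] for doc in merged])  # ['a', 'b', 'c', 'd']
```

The phrase hits keep their own order at the top, and the remaining OR hits follow in theirs, so no score comparison across the two queries is needed.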

Hope this clears things up :)

On Fri, 19 Feb 2016, 05:31 Alessandro Benedetti <abenede...@apache.org>
wrote:

> Hey Binoy,
> I can't understand why such complexity is needed, to be honest :/
> Can you explain to me why playing with:
>
> edismax
> mm (percentage of query terms you want to be in the results)
> pf (the fields you want to be boosted if the phrase matches)
> ps (slop to allow)
>
> should not solve the problem instead of the two-phase query?
>
> Cheers
>
> On 18 February 2016 at 18:09, Binoy Dalal <binoydala...@gmail.com> wrote:
>
> > Here's an alternative solution that may be of some help.
> > Here I'm assuming that you are not directly outputting the search results
> > to the user and have some sort of layer between the results from Solr and
> > presentation to the user where some additional processing can be performed.
> >
> > 1) You already know that you want phrase matches to show up higher than
> > single matches. In this case, why not do an explicit phrase match first,
> > with some slop or as is, based on how close you want the phrase terms to be
> > to each other.
> > 2) Once you have the results from the first query, fire an OR query with
> > your terms and get those results.
> > 3) Put results from (2) after (1) and present to the user. This happens in
> > the app layer.
> >
> > This is essentially the same as running a query as such: "Rheumatoid
> > Arthritis"~slop OR (Rheumatoid AND Arthritis) but you don't need to worry
> > about the ordering because you're sorting your results.
> >
> > Now, this will obviously take more time since you're querying twice and
> > then doing the additional processing in the app layer, but provided your
> > architecture is balanced enough and can cope with a little extra load, I do
> > not think that your performance will take that bad a hit. Moreover, since
> > you're in a hurry, you could implement this as a quick and dirty solution
> > to meet the project goals, provided it fits the acceptance parameters, and
> > then later play around with the scoring/sorting and figure out the best
> > possible setup to suit your needs.
> >
> > On Thu, Feb 18, 2016 at 4:22 PM Emir Arnautovic <
> > emir.arnauto...@sematext.com> wrote:
> >
> > > Hi Nitin,
> > > Can you send us what your parsed query looks like (from the debug output)?
> > >
> > > Thanks,
> > > Emir
> > >
> > > On 17.02.2016 08:38, Nitin.K wrote:
> > > > Hi Binoy,
> > > >
> > > > We are searching for both phrases and individual words,
> > > > but we want the documents that contain the phrase to come first in
> > > > the order, followed by those that match only individual words.
> > > >
> > > > termPositions = true is also not working in my case.
> > > >
> > > > I have also removed the string type from copy fields. Kindly look into
> > > > the changed configuration below:
> > > >
> > > > Hi Emir,
> > > >
> > > > I have changed the configuration as per your suggestion and added pf2 /
> > > > pf3. Yes, I saw the difference, but the ranking is still not being
> > > > followed correctly in the case of phrases.
> > > >
> > > > Changed configuration:
> > > >
> > > > <field name="topic_title" type="text_general" indexed="true" stored="true"/>
> > > > <field name="topTitle" type="text_phrase" indexed="true" stored="false"/>
> > > >
> > > > <field name="subtopic_title" type="text_general" indexed="true" stored="true"/>
> > > > <field name="subTopTitle" type="text_phrase" indexed="true" stored="false"/>
> > > >
> > > > <field name="index_term" type="text_ws" indexed="true" stored="true" multiValued="true"/>
> > > > <field name="indTerm" type="text_phrase" indexed="true" stored="false" multiValued="true"/>
> > > >
> > > > <field name="drug" type="text_ws" indexed="true" stored="true" multiValued="true"/>
> > > > <field name="drugString" type="text_phrase" indexed="true" stored="false" multiValued="true"/>
> > > >
> > > > <field name="tglData" type="text_phrase" indexed="true" stored="false"/>
> > > >
> > > > Copy fields again for the reference :
> > > >
> > > > <copyField source="topic_title" dest="topTitle"/>
> > > > <copyField source="subtopic_title" dest="subTopTitle"/>
> > > > <copyField source="index_term" dest="indTerm"/>
> > > > <copyField source="drug" dest="drugString"/>
> > > > <copyField source="content" dest="tglData"/>
> > > >
> > > > Added following field type:
> > > >
> > > > <fieldType name="text_phrase" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
> > > >       <analyzer>
> > > >               <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > > >               <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> > > >               <filter class="solr.LowerCaseFilterFactory"/>
> > > >       </analyzer>
> > > > </fieldType>
> > > >
> > > > Removed the string type from the copy fields.
> > > >
> > > > Changed Query :
> > > >
> > > > http://localhost:8983/solr/tgl/select?q=rheumatoid%20arthritis&wt=xml&tie=1.0&rows=200&q.op=AND&indent=true&defType=edismax&stopwords=true&lowercaseOperators=true&debugQuery=true&
> > > > pf=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
> > > > pf2=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
> > > > pf3=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
> > > > qf=topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3
> > > >
> > > > After making these changes, I am able to get my search results correctly
> > > > for a single term, but in the case of phrase search I am still not able
> > > > to get the results in the correct order.
> > > >
> > > > Hi Modassar,
> > > >
> > > > I tried using mm=100, but the order is still the same.
> > > >
> > > > Hi Alessandro,
> > > >
> > > > I have not yet tried the slop parameter. By default it is being taken as
> > > > 1.0 when I looked at it in debug mode. So, let me try this option too; I
> > > > will definitely get back to you.
> > > >
> > > > All,
> > > >
> > > > Please let me know if anyone has any other suggestions on this. I have
> > > > to implement it urgently and I think I am very close. Thanks to all of
> > > > you; I have reached this level only because of you guys.
> > > >
> > > > Thanks and Regards,
> > > > Nitin
> > > >
> > > >
> > > >
> > > > --
> > > > View this message in context:
> > > http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257782.html
> > > > Sent from the Solr - User mailing list archive at Nabble.com.
> > >
> > > --
> > > Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> > > Solr & Elasticsearch Support * http://sematext.com/
> > >
> > > --
> > Regards,
> > Binoy Dalal
> >
>
>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>
-- 
Regards,
Binoy Dalal
