Re: Weight servers differently

Markus Jelsma Wed, 31 Aug 2011 07:25:27 -0700

Hmm. Better to use a function query instead:

bf=query($qq)^20&qq=site:example.com


This will boost all documents from example.com in the result set without
filtering out the others.
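
A minimal sketch of sending those parameters from a client, in Python. The
endpoint URL, the function name and the explicit defType=dismax are
assumptions made for illustration; only the bf and qq values come from the
line above. The site value is switched to www.example.com to match the
question in the quoted message.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical endpoint: adjust host, port and path to your own Solr setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_boosted_query(user_query, site="example.com", boost=20):
    """Return a DisMax select URL that boosts, but does not filter on, `site`."""
    params = {
        "defType": "dismax",          # assumption: bf is a DisMax parameter
        "q": user_query,              # what the user typed, e.g. "foobar"
        "bf": f"query($qq)^{boost}",  # boost function: score of the sub-query $qq
        "qq": f"site:{site}",         # the sub-query that $qq dereferences to
    }
    # urlencode percent-escapes $, (, ), ^ and : so the URL is safe to send.
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_boosted_query("foobar", site="www.example.com")
print(url)
# Decode the query string again to see the parameters Solr will receive.
print(parse_qs(url.split("?", 1)[1]))
```

Because bf only adds to the score, hits from blog.example.com still come
back; they simply rank lower than comparable hits from www.example.com.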

On Wednesday 31 August 2011 15:58:18 Johan Svensson wrote:
> Thank you, Markus,
> 
> At this point, I just want this to work. I have no idea whether I want to
> omitNorms or not. At the moment, I don't think I do, anyway.
> More importantly, I want to boost pages whose site field is www.example.com
> over blog.example.com, but without omitting hits on blog.example.com.
> 
> The query boost seems to filter out hits from blog.example.com completely,
> so that is not what I want.
> 
> Abusing the boost field might be a nice idea. Could you please show me an
> example, bearing in mind that I don't really understand how all the XML
> files and binaries fit together, or even which of Solr and Nutch is
> responsible for which task... :)
> 
> 2011/8/31 Markus Jelsma <[email protected]>
> 
> > Index-time boosting is not very common and raises issues if you want
> > to omitNorms in Solr.
> > 
> > In Solr DisMax you can use a bq (boost query) to boost site:example.com^10.
> > All results that match the boost query receive a ^10 boost. This is done
> > only on the client side, at query time.
> > 
> > You can also abuse the boost field Nutch is writing. By default this is
> > 1.0f.
> > You can write a simple scoring filter or even an indexing filter that
> > checks the site field for your site and sets the boost field accordingly.
> > 
> > On Wednesday 31 August 2011 15:30:08 Johan Svensson wrote:
> > > I guess this is the solution. Though, I have been trying to implement
> > > this the whole afternoon with no success. I have a field "site" in my
> > > schema.xml, stored and indexed. I'm using nutch solrindex to tell Solr
> > > to index what Nutch has crawled. How can I tell Nutch to tell Solr to
> > > boost all documents with the value "www.example.com" in the "site"
> > > field? An example would be perfect for a loser like myself. I've
> > > googled all the Internets over and over.
> > > 
> > > 2011/8/31 Gora Mohanty <[email protected]>
> > > 
> > > > On Wed, Aug 31, 2011 at 2:51 PM, Johan Svensson
> > > > <[email protected]> wrote:
> > > > > Thank you! This looks interesting. However, I wonder if it really
> > > > > can solve this problem. No part of the search query is necessarily
> > > > > part of the domain name. Let's say for example that we search for
> > > > > "foobar". This word is found on www.example.com/page42.html, as
> > > > > well as on lots of pages with different names at blog.example.com/.
> > > > > Can you apply boosting magic for the hit at www.example.com although
> > > > > the search term is not a part of the URL?
> > > > 
> > > > Presumably, you know the domain name from which the
> > > > document originates at indexing time. If so, you can use
> > > > index-time boosting:
> > > > http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
> > > > E.g., this can be used to boost all documents from www.example.com
> > > > over those from blog.example.com.
> > > > 
> > > > Regards,
> > > > Gora
> > 
> > --
> > Markus Jelsma - CTO - Openindex
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
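
For the index-time route mentioned in the quoted messages, here is a rough
sketch of the decision logic only, in Python. In Nutch this would live in a
Java scoring or indexing filter, which is not shown here. The SITE_BOOSTS
table, the function names and the "id" field are hypothetical. The output is
a Solr XML update message using the per-document boost attribute that Solr
releases of that time accepted; later Solr releases dropped index-time
boosts, and they rely on norms, which is why omitNorms matters.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-site boosts; any site not listed keeps the default of 1.0.
SITE_BOOSTS = {"www.example.com": 2.0}


def boost_for_site(site, default=1.0):
    """Pick the document boost from the value of its site field."""
    return SITE_BOOSTS.get(site, default)


def to_solr_add_xml(docs):
    """Build a Solr XML <add> message with a boost attribute on each <doc>."""
    add = ET.Element("add")
    for doc in docs:
        boost = boost_for_site(doc["site"])
        doc_el = ET.SubElement(add, "doc", boost=str(boost))
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


docs = [
    {"id": "http://www.example.com/page42.html", "site": "www.example.com"},
    {"id": "http://blog.example.com/post1.html", "site": "blog.example.com"},
]
print(to_solr_add_xml(docs))
```

Both documents are still indexed and searchable; the one from
www.example.com just carries a higher document boost.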
