Hmm. Better to use a function query instead:

  bf=query($qq)^20&qq=site:example.com
This will boost all example.com documents in the result set.

On Wednesday 31 August 2011 15:58:18 Johan Svensson wrote:
> Thank you, Markus,
>
> At current rate, I just want this to work. I have no idea whether I want to
> omitNorms or not. At the moment of writing, I don't feel like that, anyway.
> More importantly, I want to boost pages whose site field is www.example.com
> over blog.example.com, but without omitting hits on blog.example.com.
>
> The query boost seems to filter out hits from blog.example.com completely,
> so that is not what I want.
>
> Abusing the boost field might be a nice idea. Can you please show me an
> example, presuming I don't really understand the connection between all the
> xml files and binaries. Not even really which one of solr and nutch is
> responsible for which task... :)
>
> 2011/8/31 Markus Jelsma <[email protected]>
>
> > Index-time boosting is not something very common and raises issues if you
> > want to omitNorms in Solr.
> >
> > In Solr DisMax you can use a bq (boost query) to boost
> > site:example.com^10. All results that match the boost query receive a
> > ^10 boost. This is only client side.
> >
> > You can also abuse the boost field Nutch is writing. By default this is
> > 1.0f. You can write a simple scoring filter or even an indexing filter
> > that checks the site field for your site and sets the boost field
> > accordingly.
> >
> > On Wednesday 31 August 2011 15:30:08 Johan Svensson wrote:
> > > I guess this is the solution. Though, I have been trying to implement
> > > this the whole afternoon with no success. I have a field "site" in my
> > > schema.xml, stored and indexed. I'm using nutch -solrindex to tell solr
> > > to index what nutch has crawled. How can I tell nutch to tell solr to
> > > boost all documents with the value "www.example.com" of the "site"
> > > field? An example would be perfect for a loser like myself. I've
> > > googled all the Internets over and over.
> > >
> > > 2011/8/31 Gora Mohanty <[email protected]>
> > >
> > > > On Wed, Aug 31, 2011 at 2:51 PM, Johan Svensson
> > > > <[email protected]> wrote:
> > > > > Thank you! This looks interesting. However, I wonder if it really
> > > > > can solve this problem. No part of the search query is by
> > > > > necessary means part of the domain name. Let's say for example
> > > > > that we search for "foobar". On www.example.com/page42.html this
> > > > > word is found, as well for lots of pages with different names at
> > > > > blog.example.com/. Can you apply boosting magic for the hit at
> > > > > www.example.com although the search term is not a part of the url?
> > > >
> > > > Presumably, you know the domain name from which the
> > > > document originates at indexing time. If so, you can use
> > > > index-time boosting:
> > > > http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
> > > > E.g., this can be used to boost all documents from www.example.com
> > > > over those from blog.example.com.
> > > >
> > > > Regards,
> > > > Gora
> >
> > --
> > Markus Jelsma - CTO - Openindex
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
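
PS: since an example was asked for, here is a rough sketch of how the bf/qq parameters above could be assembled into a DisMax request from a client. It only builds the URL and does not contact Solr. The helper name build_boosted_query, the Solr URL (localhost:8983/solr/select) and the qf field "content" are assumptions for illustration, not taken from this thread; only the "site" field and the bf/qq syntax come from the messages above.

```python
from urllib.parse import urlencode, parse_qs

# Assumed Solr endpoint; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_boosted_query(user_query, site="www.example.com", boost=20,
                        query_fields="content"):
    """Build a URL-encoded DisMax query string that boosts one site.

    Documents matching site:<site> get an additive function boost;
    documents from other sites are still returned, just not boosted.
    """
    params = {
        "defType": "dismax",
        "q": user_query,
        "qf": query_fields,           # assumed searchable field name
        "bf": f"query($qq)^{boost}",  # boost function referencing $qq
        "qq": f"site:{site}",         # sub-query the function evaluates
    }
    return urlencode(params)


query_string = build_boosted_query("foobar")
print(SOLR_SELECT_URL + "?" + query_string)

# Decode again to check that the parameters survive URL encoding.
decoded = parse_qs(query_string)
print(decoded["bf"][0], decoded["qq"][0])
```

Swapping the bf/qq pair for a single "bq": "site:www.example.com^10" entry gives the boost query variant mentioned earlier in the thread.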

