Re: PageRanking with DIH

2012-06-14 Thread Chris Hostetter

: I have computed PageRank offline for a dump of the document set. Ideally,
: I want to combine PageRank and the Solr relevancy score in one formula to
: sort Solr search results. I have already looked at
: http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_increase_the_score_for_specific_documents
: and found that index-time boost is useful. How can I use
: index-time boost?

I would strongly suggest that, instead of using an index-time boost, you
use a boost function on a numeric field (the very next section of that
SolrRelevancyFAQ).
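
As a rough illustration of the boost-function approach (a sketch, not taken
from the FAQ): assuming the offline score is indexed in a numeric field
hypothetically named `pagerank`, and assuming the edismax query parser is
in use, the request parameters could be built as below. The field name, the
example query text, and the choice of `sum(1,pagerank)` as the multiplier
are all assumptions made for the example.

```python
from urllib.parse import urlencode

def boosted_query_params(user_query, rank_field="pagerank"):
    """Build request parameters that multiply the relevancy score
    by a function of a numeric field (hypothetical name: pagerank)."""
    return {
        "defType": "edismax",   # assumes the edismax parser is available
        "q": user_query,
        # Adding 1 keeps the multiplier from zeroing out documents
        # whose rank is 0 (assumes ranks are non-negative).
        "boost": f"sum(1,{rank_field})",
    }

query_string = urlencode(boosted_query_params("solr ranking"))
print(query_string)
```

Because the function is evaluated at query time, the formula can be changed
without reindexing, which is the main attraction over an index-time boost.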

I've updated the page to try to make this alternative method more
obvious, and mentioned the use of ExternalFileField (for the case where
you want to be able to update these rankings without reindexing):

http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_increase_the_score_for_specific_documents
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_change_the_score_of_a_document_based_on_the_.2Avalue.2A_of_a_field_.28say.2C_.22popularity.22.29
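
For the ExternalFileField route, the rankings live in a plain text file of
`key=value` lines rather than in the index. A minimal sketch of producing
such a file from offline scores follows. The document keys, the scores, and
the field name `pagerank` are made up for the example, and the file naming
and location (`external_<fieldname>` in the index data directory) should be
checked against the Solr version in use.

```python
import os
import tempfile

def write_external_file(scores, directory, field_name="pagerank"):
    """Write one `docKey=value` line per document for ExternalFileField."""
    path = os.path.join(directory, f"external_{field_name}")
    with open(path, "w", encoding="utf-8") as fh:
        for doc_key, score in sorted(scores.items()):
            fh.write(f"{doc_key}={score}\n")
    return path

# Hypothetical offline PageRank results, keyed by the unique document key
scores = {"doc1": 0.15, "doc2": 0.42, "doc3": 0.03}
out_dir = tempfile.mkdtemp()   # stand-in for the Solr data directory
path = write_external_file(scores, out_dir)

with open(path, encoding="utf-8") as fh:
    written = fh.read()
print(written)
```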

-Hoss


PageRanking with DIH

2012-06-12 Thread vineet yadav
Hi,
I have indexed a set of documents and computed PageRank for them. I want
to update the PageRank of the indexed documents and sort Solr search
results by PageRank.
I did some research and found that index-time boost can be used, but
I don't know how to use it. Can I boost a document at index time with
DIH? Can anybody help me with this? Can I combine the Solr relevancy
score with the PageRank score to sort search results? Any suggestions
are welcome!
Thanks


Re: PageRanking with DIH

2012-06-12 Thread Gora Mohanty
On 12 June 2012 13:04, vineet yadav vineet.yadav.i...@gmail.com wrote:
> Hi,
> I have indexed a set of documents and computed PageRank for them. I want
> to update the PageRank of the indexed documents and sort Solr search
> results by PageRank.

Your question is not entirely clear: What is pagerank in this case?
A custom score that you can compute at indexing time, and by
which you want to order retrieved results? If so, just add a pagerank
field to your Solr records, ignore Solr's order, and instead sort results
by that field.
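
A short sketch of that suggestion, assuming the stored field is
hypothetically named `pagerank`. Using the relevancy score as a
tie-breaker is an addition for the example, not part of the advice above.

```python
from urllib.parse import urlencode

params = {
    "q": "solr ranking",                  # example query text
    # Order by the stored rank first; relevancy only breaks ties.
    "sort": "pagerank desc, score desc",
}
query_string = urlencode(params)
print(query_string)
```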

> I did some research and found that index-time boost can be used, but
> I don't know how to use it. Can I boost a document at index time with
> DIH? Can anybody help me with this? Can I combine the Solr relevancy
> score with the PageRank score to sort search results? Any suggestions
> are welcome!

This is confused: Do you want your pagerank as the sole basis for
the ranking of returned results, or do you want it to be one of multiple
(weighted) criteria? Maybe you should read
http://wiki.apache.org/solr/SolrRelevancyFAQ

Regards,
Gora


Re: PageRanking with DIH

2012-06-12 Thread vineet yadav
Hi Gora,
Thanks for the reply.
I have computed PageRank offline for a dump of the document set. Ideally,
I want to combine PageRank and the Solr relevancy score in one formula to
sort Solr search results. I have already looked at
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_increase_the_score_for_specific_documents
and found that index-time boost is useful. How can I use
index-time boost?
Thanks
Vineet Yadav



Re: PageRanking with DIH

2012-06-12 Thread Gora Mohanty
On 12 June 2012 13:51, vineet yadav vineet.yadav.i...@gmail.com wrote:
> Hi Gora,
> Thanks for the reply.
> I have computed PageRank offline for a dump of the document set. Ideally,
> I want to combine PageRank and the Solr relevancy score in one formula to
> sort Solr search results. I have already looked at
> http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_increase_the_score_for_specific_documents
> and found that index-time boost is useful. How can I use
> index-time boost?
[...]

That depends on how you are indexing data into Solr.
That page explains how to do index-time boosting at the
record level and at the field level for XML documents
uploaded to Solr with post.sh.
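
For the XML upload case, a record-level boost is an attribute on the
`<doc>` element. The sketch below generates such a payload; the field
names, the document, and the boost value are invented for the example, and
it assumes a Solr version from the era of this thread (later releases are
understood to have dropped index-time boosts, so check the version in use).

```python
import xml.etree.ElementTree as ET

def build_add_xml(docs):
    """docs: iterable of (fields_dict, boost) pairs -> Solr <add> XML string."""
    add = ET.Element("add")
    for fields, boost in docs:
        # Record-level (document) boost goes on the <doc> element
        doc = ET.SubElement(add, "doc", boost=str(boost))
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")

xml_payload = build_add_xml([({"id": "doc1", "title": "Example page"}, 2.5)])
print(xml_payload)
```

A field-level boost would be expressed the same way, as a `boost`
attribute on an individual `<field>` element.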

If you are using the Solr DataImportHandler, you can boost
records, but not individual fields, as far as I am aware. Please
take a look at this thread for an example:
http://lucene.472066.n3.nabble.com/Index-time-boosting-with-DIH-td3206271.html

It would help if you did some basic groundwork, tried things
out for yourself, and asked more specific questions. You might
wish to read http://wiki.apache.org/solr/UsingMailingLists

Regards,
Gora


RE: PageRanking with DIH

2012-06-12 Thread Dyer, James
To boost a document with DIH, see this section about $docBoost in the wiki 
here:  http://wiki.apache.org/solr/DataImportHandler#Special_Commands.

If you're using an RDBMS for source data, your query would have something like 
this in it: select PAGE_RANK as '$docBoost', ... from ... etc
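
To make the shape of such a query result concrete, here is a small sketch
that uses an in-memory SQLite database as a stand-in for the real RDBMS.
The table and column names are hypothetical, and alias quoting differs
between databases (double quotes are used here for SQLite). The point is
only that each row comes back with a column literally named `$docBoost`
next to the regular fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE pages (id TEXT PRIMARY KEY, page_rank REAL)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [("doc1", 0.15), ("doc2", 0.42)],
)

# The alias is what matters: the rank is exposed under the name $docBoost
rows = conn.execute(
    'SELECT id, page_rank AS "$docBoost" FROM pages ORDER BY id'
).fetchall()
first = dict(rows[0])
print(first)
```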

If you don't want to boost entire documents at index time and would rather 
keep the boost flexible at query time, see the page on Extended Dismax, 
especially the boost function section:
http://wiki.apache.org/solr/ExtendedDisMax?highlight=%28edismax%29#bf_.28Boost_Function.2C_additive.29
Also, the Packt Solr book (Smiley & Pugh) has a nice section about boosting 
scores based on page-rank or popularity-type fields. In the old first edition 
it's Chapter 5, "Enhanced Searching".
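
A minimal sketch of the additive `bf` variant, assuming a numeric field
hypothetically named `pagerank`. The weight of 2.0 and the
`log(sum(pagerank,1))` damping are arbitrary choices for the example and
would need tuning against real relevancy scores.

```python
from urllib.parse import urlencode

def additive_boost_params(user_query, rank_field="pagerank", weight=2.0):
    """Build edismax parameters that add a weighted function of a
    numeric field (hypothetical name: pagerank) to the relevancy score."""
    return {
        "defType": "edismax",
        "q": user_query,
        # Adding 1 inside log() keeps the argument positive when the rank is 0;
        # ^weight scales how much the function contributes to the score.
        "bf": f"log(sum({rank_field},1))^{weight}",
    }

bf_params = additive_boost_params("solr ranking")
print(urlencode(bf_params))
```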

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

