[ 
https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256886#comment-15256886
 ] 

Joshua Pantony commented on SOLR-8542:
--------------------------------------

Hi, thanks for the interest! Was there a specific algorithm you had in mind 
that is currently not supported? Often it is possible to formulate comparisons 
in the training phase in such a way that you still only need to compute one 
score per document in the live phase. Let's use rankSVM (a pairwise approach) 
as an example. Given documents D1 and D2, a function V(D) that returns the 
feature vector of a document, and a weight vector W to be learned, if we know 
that D1 should rank above D2 (D1 > D2), we can formulate this in the training 
stage as the objective (V(D1) - V(D2)) * W > 0, where * is the dot product. 
Here we have created an objective function by directly comparing pairs of 
documents D1 and D2, hence it is pairwise. In the live phase, given documents 
D1, D2, D3 and D4, we "could" take a direct pairwise approach, i.e. evaluate:

(V(D1) - V(D2)) * W > 0 ?,
(V(D1) - V(D3)) * W > 0 ?,
(V(D1) - V(D4)) * W > 0 ?,
(V(D2) - V(D3)) * W > 0 ?,
(V(D2) - V(D4)) * W > 0 ?,
(V(D3) - V(D4)) * W > 0 ?

However, this is computationally inefficient. If we compare directly using the 
objective function we trained on, we need 6 dot products for 4 documents, and 
the number of pairs grows quadratically with the number of documents. Because 
the dot product is linear, in the live phase we can rewrite 
(V(D1) - V(D2)) * W > 0 as V(D1) * W > V(D2) * W. Now all we need to do in a 
live setting is calculate V(D1) * W, V(D2) * W, V(D3) * W and V(D4) * W. Once 
we do that, we can just sort the numbers and voilà, we've done pairwise 
comparisons in the same time complexity as a pointwise approach. Of course, 
don't just trust me, read this paper: 
http://www.cs.cornell.edu/people/tj/publications/joachims_02c.pdf (note that I 
vastly simplified rankSVM here for ease of dialogue). 
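
To make the equivalence concrete, here is a minimal Python sketch. It is only 
an illustration, not code from the patch: the weight vector W, the feature 
vectors, and the helper names (dot, rank_pointwise, rank_pairwise) are all made 
up for this example. It ranks the four documents once by counting pairwise wins 
with (V(Di) - V(Dj)) * W > 0 and once by sorting on V(D) * W, and both give the 
same order.

```python
from itertools import combinations


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_pointwise(features, w):
    """Score each document once with V(D) * W, then sort by score (descending)."""
    scores = {doc: dot(v, w) for doc, v in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


def rank_pairwise(features, w):
    """Evaluate (V(Di) - V(Dj)) * W > 0 for every pair and sort by number of wins."""
    wins = {doc: 0 for doc in features}
    for a, b in combinations(features, 2):
        diff = [x - y for x, y in zip(features[a], features[b])]
        if dot(diff, w) > 0:
            wins[a] += 1
        else:
            wins[b] += 1
    return sorted(wins, key=wins.get, reverse=True)


# Hypothetical weights and feature vectors, chosen so that no two scores tie.
W = [2.0, -1.0, 1.0]
FEATURES = {
    "D1": [1.0, 2.0, 1.0],
    "D2": [3.0, 1.0, 0.0],
    "D3": [0.0, 0.0, 2.0],
    "D4": [2.0, 4.0, 3.0],
}

print(rank_pointwise(FEATURES, W))  # 4 dot products
print(rank_pairwise(FEATURES, W))   # 6 dot products, same ordering
```

With distinct scores the two orderings always agree, since each pairwise test 
is just a comparison of the two pointwise scores.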

So, all that being said, I'll circle back to my original question: was there a 
specific algorithm you had in mind that we don't easily support? If so, happy 
to add it in some future patch (no promise on when, though). [It should be 
noted that there is some debate / grey area around whether LambdaMART is 
listwise or pairwise, but it is generally considered among the strongest 
performing methods.]

> Integrate Learning to Rank into Solr
> ------------------------------------
>
>                 Key: SOLR-8542
>                 URL: https://issues.apache.org/jira/browse/SOLR-8542
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Joshua Pantony
>            Assignee: Christine Poerschke
>            Priority: Minor
>         Attachments: README.md, README.md, SOLR-8542-branch_5x.patch, 
> SOLR-8542-trunk.patch
>
>
> This is a ticket to integrate learning to rank machine learning models into 
> Solr. Solr Learning to Rank (LTR) provides a way for you to extract features 
> directly inside Solr for use in training a machine learned model. You can 
> then deploy that model to Solr and use it to rerank your top X search 
> results. This concept was previously presented by the authors at Lucene/Solr 
> Revolution 2015 ( 
> http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp
>  ).
> The attached code was jointly worked on by Joshua Pantony, Michael Nilsson, 
> David Grohmann and Diego Ceccarelli.
> Any chance this could make it into a 5x release? We've also attached 
> documentation as a GitHub Markdown file, but are happy to convert it to a 
> desired format.
> h3. Test the plugin with solr/example/techproducts in 6 steps
> Solr provides some simple example indices. To test the plugin with the 
> techproducts example, please follow these steps:
> h4. 1. compile solr and the examples 
> cd solr
> ant dist
> ant example
> h4. 2. run the example
> ./bin/solr -e techproducts 
> h4. 3. stop it and install the plugin:
>    
> ./bin/solr stop
> mkdir example/techproducts/solr/techproducts/lib
> cp build/contrib/ltr/lucene-ltr-6.0.0-SNAPSHOT.jar example/techproducts/solr/techproducts/lib/
> cp contrib/ltr/example/solrconfig.xml example/techproducts/solr/techproducts/conf/
> h4. 4. run the example again
>     
> ./bin/solr -e techproducts
> h4. 5. index some features and a model
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/fstore' --data-binary "@./contrib/ltr/example/techproducts-features.json" -H 'Content-type:application/json'
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/mstore' --data-binary "@./contrib/ltr/example/techproducts-model.json" -H 'Content-type:application/json'
> h4. 6. have fun !
> *access to the default feature store*
> http://localhost:8983/solr/techproducts/schema/fstore/_DEFAULT_ 
> *access to the model store*
> http://localhost:8983/solr/techproducts/schema/mstore
> *perform a query using the model, and retrieve the features*
> http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr%20model=svm%20reRankDocs=25%20efi.query=%27test%27}&fl=*,[features],price,score,name&fv=true
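
The rerank query above packs several local parameters into rq, which makes the 
URL easy to get wrong by hand. As a rough sketch, the same URL can be assembled 
in Python with the standard library. The parameter names and values are taken 
from the example query; the helper name ltr_query_url is hypothetical, and the 
sketch assumes the query text contains no single quotes.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint from the techproducts example above.
BASE = "http://localhost:8983/solr/techproducts/query"


def ltr_query_url(q, model="svm", rerank_docs=25):
    """Build the rerank query URL (hypothetical helper, not part of the patch)."""
    # Local-params string exactly as in the example, before URL encoding.
    rq = "{!ltr model=%s reRankDocs=%d efi.query='%s'}" % (model, rerank_docs, q)
    params = {
        "indent": "on",
        "q": q,
        "wt": "json",
        "rq": rq,
        "fl": "*,[features],price,score,name",
        "fv": "true",
    }
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return BASE + "?" + urlencode(params, quote_via=quote)


url = ltr_query_url("test")
print(url)
# Decoding the query string recovers the original parameters.
print(parse_qs(urlsplit(url).query)["rq"][0])
```

This encodes more characters than the hand-written URL does (for example the 
braces and brackets), but it decodes to the same parameter values.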


