[ 
https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163105#comment-15163105
 ] 

Christine Poerschke commented on SOLR-8542:
-------------------------------------------

Question related to [Feature 
Engineering|https://en.wikipedia.org/wiki/Feature_engineering] - is that the 
right term? - and feature extraction.

[LTRQParserPlugin.java#L117|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/src/java/org/apache/solr/ltr/ranking/LTRQParserPlugin.java#L117]
 mentions

bq. For training a new model offline you need feature vectors, but dont yet 
have a model.

and 
[README.txt#L280|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/README.txt#L280]
 mentions using a dummy model for now, e.g.

bq. fv=true&fl=*,score,\[features\]&rq={!ltr model=dummyModel reRankDocs=25}

to extract features.
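
For reference, the dummy-model request above could be assembled programmatically. This is only a minimal sketch: it assumes the techproducts core on localhost:8983 from the issue description and an already uploaded model named dummyModel, and the helper name build_feature_extraction_url is hypothetical, not part of the plugin.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_feature_extraction_url(base_url, query, model="dummyModel", rerank_docs=25):
    """Build a query URL that asks Solr to return feature vectors.

    The parameter names (fv, fl, rq) are taken from the README example;
    the model name and reRankDocs value are illustrative defaults.
    """
    params = {
        "q": query,
        "fv": "true",
        "fl": "*,score,[features]",
        "rq": "{!ltr model=%s reRankDocs=%d}" % (model, rerank_docs),
    }
    # urlencode percent-escapes the braces, brackets and spaces for us.
    return base_url + "?" + urlencode(params)


url = build_feature_extraction_url(
    "http://localhost:8983/solr/techproducts/query", "test"
)
print(url)
```

Building the URL from a dictionary avoids hand-escaping the {{rq}} local params, which is where the dummy model currently has to be named.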

If it is known already, could you outline what the replacement for the above 
fv/fl/dummyModel combination is likely to look like?

Semi-related to that:
* would the {{efi.*}} parameters then move out of the {{rq}}, since candidate 
features to be returned in the response might reference external feature info?
* might it be useful to have optional version and/or comment string elements in 
the feature and model JSON format? Illustration:
{code}
{
  "type": "org.apache.solr.ltr.feature.impl.SolrFeature",
  "name":  "documentRecency",
  "comment": "Initial version, we may have to tweak the recip function arguments later.",
  "params": {
      "q": "{!func}recip( ms(NOW,publish_date), 3.16e-11, 1, 1)"
  }
}
...
{
    "type":"org.apache.solr.ltr.ranking.RankSVMModel",
    "name":"myModelName",
    "version": "1.0",
    "comment": "features and parameters determined using XYZ with ABC data, ticket reference: 12345",
    "features":[
        ...
    ],
    "params":{
        ...
    }
}
{code}
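
To make the suggestion concrete, here is a rough sketch of how a consumer of these JSON definitions might treat the version and comment elements as optional. It is not the plugin's loader; the describe function and the "unversioned" fallback are hypothetical and only illustrate that existing files without the new elements would keep working.

```python
import json

# Feature definition from the illustration above, including the proposed
# optional "comment" element and no "version" element.
FEATURE_JSON = """
{
  "type": "org.apache.solr.ltr.feature.impl.SolrFeature",
  "name": "documentRecency",
  "comment": "Initial version, we may have to tweak the recip function arguments later.",
  "params": {
    "q": "{!func}recip( ms(NOW,publish_date), 3.16e-11, 1, 1)"
  }
}
"""


def describe(definition):
    """Return a one-line summary; 'version' and 'comment' may be absent."""
    name = definition["name"]  # required
    version = definition.get("version", "unversioned")  # optional
    comment = definition.get("comment", "")  # optional
    summary = "%s (%s)" % (name, version)
    return summary + (": " + comment if comment else "")


feature = json.loads(FEATURE_JSON)
print(describe(feature))
```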

> Integrate Learning to Rank into Solr
> ------------------------------------
>
>                 Key: SOLR-8542
>                 URL: https://issues.apache.org/jira/browse/SOLR-8542
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Joshua Pantony
>            Assignee: Christine Poerschke
>            Priority: Minor
>         Attachments: README.md, README.md, SOLR-8542-branch_5x.patch, 
> SOLR-8542-trunk.patch
>
>
> This is a ticket to integrate learning to rank machine learning models into 
> Solr. Solr Learning to Rank (LTR) provides a way for you to extract features 
> directly inside Solr for use in training a machine learned model. You can 
> then deploy that model to Solr and use it to rerank your top X search 
> results. This concept was previously presented by the authors at Lucene/Solr 
> Revolution 2015 ( 
> http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp
>  ).
> The attached code was jointly worked on by Joshua Pantony, Michael Nilsson, 
> David Grohmann and Diego Ceccarelli.
> Any chance this could make it into a 5x release? We've also attached 
> documentation as a github MD file, but are happy to convert to a desired 
> format.
> h3. Test the plugin with solr/example/techproducts in 6 steps
> Solr provides some simple example indices. To test the plugin with 
> the techproducts example, please follow these steps:
> h4. 1. compile solr and the examples 
> cd solr
> ant dist
> ant example
> h4. 2. run the example
> ./bin/solr -e techproducts 
> h4. 3. stop it and install the plugin:
>    
> ./bin/solr stop
> mkdir example/techproducts/solr/techproducts/lib
> cp build/contrib/ltr/lucene-ltr-6.0.0-SNAPSHOT.jar example/techproducts/solr/techproducts/lib/
> cp contrib/ltr/example/solrconfig.xml example/techproducts/solr/techproducts/conf/
> h4. 4. run the example again
>     
> ./bin/solr -e techproducts
> h4. 5. index some features and a model
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/fstore' --data-binary "@./contrib/ltr/example/techproducts-features.json" -H 'Content-type:application/json'
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/mstore' --data-binary "@./contrib/ltr/example/techproducts-model.json" -H 'Content-type:application/json'
> h4. 6. have fun !
> *access to the default feature store*
> http://localhost:8983/solr/techproducts/schema/fstore/_DEFAULT_ 
> *access to the model store*
> http://localhost:8983/solr/techproducts/schema/mstore
> *perform a query using the model, and retrieve the features*
> http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr%20model=svm%20reRankDocs=25%20efi.query=%27test%27}&fl=*,[features],price,score,name&fv=true


