cpoerschke commented on a change in pull request #1571:
URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r519944462
########## File path: solr/solr-ref-guide/src/learning-to-rank.adoc ##########
@@ -247,6 +254,81 @@ The output XML will include feature values as a comma-separated list, resembling
 }}
 ----
+=== Running a Rerank Query Interleaving Two Models
+
+To rerank the results of a query, interleaving two models (myModelA, myModelB) add the `rq` parameter to your search, passing two models in input, for example:
+
+[source,text]
+http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score
+
+To obtain the model that interleaving picked for a search result, computed during reranking, add `[interleaving]` to the `fl` parameter, for example:

Review comment:
   question: if myModelA had `[ doc1, doc2, doc3 ]` document order and myModelB had `[ doc1, doc3, doc2 ]` document order, i.e. there was agreement between the models on the first document, will `[interleaving]` return (1) randomly `myModelA` or `myModelB` depending on how the picking actually happened, or (2) something else, e.g. `myModelA,myModelB` (if myModelA actually picked and myModelB agreed) or `myModelB,myModelA` (if myModelB actually picked and myModelA agreed), or (3) neither, given that in a way neither of them picked the document since they both agreed on it?

   answer-ish: from recalling the implementation, I think the answer is (1), though from a user's perspective it might be nice to clarify that here somehow. A subtle aspect (if I understand things right) is that `[features]` and `[interleaving]` could both be requested in the `fl`, and whilst myModelA and myModelB might have agreed that `doc1` should be the first document, they might have used very different features to arrive at that conclusion and their `score` values could also differ.
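   For the question above, a minimal team-draft sketch may help. This is hypothetical code written for this discussion, not the actual `TeamDraftInterleaving` implementation; `TeamDraftSketch`, `pickedBy` and the hard-coded model names are illustrative. It assumes both rankings contain the same documents, as when two models rerank the same top-N candidates, and shows why behaviour (1) falls out: when both models rank the same document first, the credit goes to whichever team happened to draft first.
   ```java
   import java.util.ArrayList;
   import java.util.HashSet;
   import java.util.List;
   import java.util.Random;
   import java.util.Set;

   // Sketch only: assumes rankingA and rankingB hold the same documents.
   class TeamDraftSketch {

     static List<String> interleave(List<String> rankingA, List<String> rankingB,
                                    List<String> pickedBy, Random random) {
       List<String> interleaved = new ArrayList<>();
       Set<String> chosen = new HashSet<>();
       int picksA = 0, picksB = 0;
       while (interleaved.size() < rankingA.size()) {
         // The team with fewer picks drafts next; a coin flip breaks ties.
         boolean aDrafts = picksA < picksB || (picksA == picksB && random.nextBoolean());
         for (String doc : aDrafts ? rankingA : rankingB) {
           if (chosen.add(doc)) {                              // highest-ranked doc not yet taken
             interleaved.add(doc);
             pickedBy.add(aDrafts ? "myModelA" : "myModelB");  // only the drafting team gets credit
             break;
           }
         }
         if (aDrafts) picksA++; else picksB++;
       }
       return interleaved;
     }
   }
   ```
   With `rankingA = [doc1, doc2, doc3]` and `rankingB = [doc1, doc3, doc2]`, the very first draft is decided by the coin flip, so `doc1` is credited to either `myModelA` or `myModelB` at random even though both models agreed it should come first.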
########## File path: solr/solr-ref-guide/src/learning-to-rank.adoc ##########
@@ -247,6 +254,81 @@ The output XML will include feature values as a comma-separated list, resembling
 }}
 ----
+=== Running a Rerank Query Interleaving Two Models
+
+To rerank the results of a query, interleaving two models (myModelA, myModelB) add the `rq` parameter to your search, passing two models in input, for example:
+
+[source,text]
+http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score
+
+To obtain the model that interleaving picked for a search result, computed during reranking, add `[interleaving]` to the `fl` parameter, for example:
+
+[source,text]
+http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score,[interleaving]
+
+The output XML will include the model picked for each search result, resembling the output shown here:
+
+[source,json]
+----
+{
+  "responseHeader":{
+    "status":0,
+    "QTime":0,
+    "params":{
+      "q":"test",
+      "fl":"id,score,[interleaving]",
+      "rq":"{!ltr model=myModelA model=myModelB reRankDocs=100}"}},
+  "response":{"numFound":2,"start":0,"maxScore":1.0005897,"docs":[
+      {
+        "id":"GB18030TEST",
+        "score":1.0005897,
+        "[interleaving]":"myModelB"},
+      {
+        "id":"UTF8TEST",
+        "score":0.79656565,
+        "[interleaving]":"myModelA"}]
+  }}
+----
+
+=== Running a Rerank Query Interleaving a model with the original ranking
+When approaching Search Quality Evaluation with interleaving it may be useful to compare a model with the original ranking.
+To rerank the results of a query, interleaving a model with the original ranking, add the `rq` parameter to your search, with a model in input and activating the original ranking interleaving, for example:
+
+[source,text]
+http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModel model=_OriginalRanking_ reRankDocs=100}&fl=id,score

Review comment:
   subjective: might `model=_OriginalRanking_ model=myModel` be more intuitive, i.e. the 'from' baseline model on the left and the 'to' alternative model on the right? (I recall that the code had an "original ranking last" assumption before, but if that's gone there's a possibility here to swap the order.)

########## File path: solr/solr-ref-guide/src/learning-to-rank.adoc ##########
@@ -418,6 +500,14 @@ Learning-To-Rank is a contrib module and therefore its plugins must be configure
 </transformer>
 ----
+* Declaration of the `[interleaving]` transformer.
++
+[source,xml]
+----
+<transformer name="interleaving" class="org.apache.solr.ltr.response.transform.LTRInterleavingTransformerFactory">
+</transformer>

Review comment:
   minor/subjective: could shorten since there are no parameters:
   ```
   <transformer name="interleaving" class="org.apache.solr.ltr.response.transform.LTRInterleavingTransformerFactory"/>
   ```

########## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/interleaving/TeamDraftInterleaving.java ##########
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.ltr.interleaving;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.LinkedHashSet;
+import java.util.Random;
+import java.util.Set;
+
+import org.apache.lucene.search.ScoreDoc;
+
+public class TeamDraftInterleaving implements Interleaving{
+  public static Random RANDOM;
+
+  static {
+    // We try to make things reproducible in the context of our tests by initializing the random instance
+    // based on the current seed
+    String seed = System.getProperty("tests.seed");

Review comment:
   Ah, precedent and existing use already, good to know, thanks for sharing!
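   On the `tests.seed` point, a minimal sketch of that kind of seeding for reference. This is hypothetical illustration code, assuming the Lucene/Solr test runner's `tests.seed` system property; the actual class may derive the seed differently.
   ```java
   import java.util.Random;

   // Illustration only: derive a deterministic Random when the test runner
   // supplies a "tests.seed" system property, otherwise fall back to an
   // unseeded Random for normal (non-test) operation.
   class SeededRandomSketch {
     static final Random RANDOM;

     static {
       String seed = System.getProperty("tests.seed");
       RANDOM = (seed == null) ? new Random() : new Random(seed.hashCode());
     }
   }
   ```
   With a fixed seed the sequence of team-draft coin flips, and therefore the interleaved order and the per-document `[interleaving]` credits, is reproducible from one test run to the next.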
########## File path: solr/solr-ref-guide/src/learning-to-rank.adoc ##########
@@ -247,6 +254,81 @@ The output XML will include feature values as a comma-separated list, resembling
 }}
 ----
+=== Running a Rerank Query Interleaving Two Models
+
+To rerank the results of a query, interleaving two models (myModelA, myModelB) add the `rq` parameter to your search, passing two models in input, for example:
+
+[source,text]
+http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score
+
+To obtain the model that interleaving picked for a search result, computed during reranking, add `[interleaving]` to the `fl` parameter, for example:
+
+[source,text]
+http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score,[interleaving]
+
+The output XML will include the model picked for each search result, resembling the output shown here:
+
+[source,json]
+----
+{
+  "responseHeader":{
+    "status":0,
+    "QTime":0,
+    "params":{
+      "q":"test",
+      "fl":"id,score,[interleaving]",
+      "rq":"{!ltr model=myModelA model=myModelB reRankDocs=100}"}},
+  "response":{"numFound":2,"start":0,"maxScore":1.0005897,"docs":[
+      {
+        "id":"GB18030TEST",
+        "score":1.0005897,
+        "[interleaving]":"myModelB"},
+      {
+        "id":"UTF8TEST",
+        "score":0.79656565,
+        "[interleaving]":"myModelA"}]
+  }}
+----
+
+=== Running a Rerank Query Interleaving a model with the original ranking
+When approaching Search Quality Evaluation with interleaving it may be useful to compare a model with the original ranking.
+To rerank the results of a query, interleaving a model with the original ranking, add the `rq` parameter to your search, with a model in input and activating the original ranking interleaving, for example:

Review comment:
   ```suggestion
   To rerank the results of a query, interleaving a model with the original ranking, add the `rq` parameter to your search, passing the special inbuilt `_OriginalRanking_` model identifier as one model and your comparison model as the other model, for example:
   ```

########## File path: solr/solr-ref-guide/src/learning-to-rank.adoc ##########
@@ -779,3 +869,7 @@ The feature store and the model store are both <<managed-resources.adoc#managed-
 * "Learning to Rank in Solr" presentation at Lucene/Solr Revolution 2015 in Austin:
 ** Slides: http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp
 ** Video: https://www.youtube.com/watch?v=M7BKwJoh96s
+

Review comment:
   We've got _"... Contributions for further models, features and normalizers are welcome. ..."_ above, any thoughts on adding "interleaving algorithms" to that list?
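   Relating to the `_OriginalRanking_` comparison discussed above: once `[interleaving]` reports which model picked each clicked result, a client-side tally is one way to compare the two arms. A minimal sketch follows; it is hypothetical illustration code, and the class name, method and sample data are not part of Solr's API.
   ```java
   import java.util.List;
   import java.util.Map;
   import java.util.stream.Collectors;

   // Illustration only: count, per model, how many clicked results were credited
   // to it by the [interleaving] transformer, e.g. to compare a model against
   // the original ranking in an interleaved evaluation.
   class InterleavingWinCounter {

     static Map<String, Long> countWins(List<String> pickedModelOfClickedResults) {
       return pickedModelOfClickedResults.stream()
           .collect(Collectors.groupingBy(model -> model, Collectors.counting()));
     }

     public static void main(String[] args) {
       // [interleaving] values of the results that users actually clicked
       List<String> clicked = List.of("myModel", "_OriginalRanking_", "myModel", "myModel");
       System.out.println(countWins(clicked)); // e.g. {_OriginalRanking_=1, myModel=3} (map order may vary)
     }
   }
   ```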