[
https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris M. Hostetter updated SOLR-17296:
--------------------------------------
Attachment: SOLR-17296.test.patch
Status: Open (was: Open)
Both of these problems are trivial to reproduce using the cloud example ....
Data setup and baseline rerank request (no debugging)
{noformat}
$ ./solr/packaging/build/dev/bin/solr start -e cloud -noprompt
...
$ curl -X POST -H 'Content-Type: application/csv' \
    'http://localhost:8983/solr/gettingstarted/update?commit=true' \
    --data-binary @solr/example/exampledocs/books.csv
...
$ curl http://localhost:8983/solr/gettingstarted/select --form-string 'q=*:*' \
    --form-string 'rows=1' \
    --form-string 'rq={!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}' \
    --form-string 'rr=inStock=true'
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":26,
"params":{
"rr":"inStock=true",
"q":"*:*",
"rows":"1",
"rq":"{!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3
reRankDocs=1000 reRankOperator=add}"
}
},
"response":{
"numFound":10,
"start":0,
"maxScore":1.0,
"numFoundExact":true,
"docs":[{
"id":"0812521390",
"cat":["book"],
"name":["The Black Company"],
"price":[6.99],
"inStock":[false],
"author":["Glen Cook"],
"series_t":"The Chronicles of The Black Company",
"sequence_i":1,
"genre_s":"fantasy",
"_version_":1799231031515021312
}]
}
}
{noformat}
Try using {{debug=true}} and now none of the stored fields are returned...
{noformat}
$ curl http://localhost:8983/solr/gettingstarted/select --form-string 'q=*:*' \
    --form-string 'rows=1' \
    --form-string 'rq={!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}' \
    --form-string 'rr=inStock=true' --form-string 'debug=true'
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":13,
"params":{
"rr":"inStock=true",
"q":"*:*",
"debug":"true",
"rows":"1",
"rq":"{!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3
reRankDocs=1000 reRankOperator=add}"
}
},
"response":{
"numFound":10,
"start":0,
"maxScore":1.0,
"numFoundExact":true,
"docs":[{
"id":"0812521390"
}]
},
"debug":{
"track":{
...
{noformat}
Use {{debug=results}} (or {{debug=all}}) to trigger an NPE...
{noformat}
$ curl http://localhost:8983/solr/gettingstarted/select --form-string 'q=*:*' \
    --form-string 'rows=1' \
    --form-string 'rq={!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}' \
    --form-string 'rr=inStock=true' --form-string 'debug=results'
{
"responseHeader":{
"zkConnected":true,
"status":500,
"QTime":37,
"params":{
"rr":"inStock=true",
"q":"*:*",
"debug":"results",
"rows":"1",
"rq":"{!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3
reRankDocs=1000 reRankOperator=add}"
}
},
"error":{
"metadata":["error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException","root-error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException"],
"msg":"Error from server at
http://localhost:7574/solr/gettingstarted_shard1_replica_n2/select:
java.lang.NullPointerException\n\tat
org.apache.solr.search.ReRankScaler.explain(ReRankScaler.java:348)...
{noformat}
I'm attaching a very basic test patch that demonstrates both of these problems.
> rerank w/scaling (still) broken when using debug to get explain info
> --------------------------------------------------------------------
>
> Key: SOLR-17296
> URL: https://issues.apache.org/jira/browse/SOLR-17296
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 9.4
> Reporter: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-17296.test.patch
>
>
> The changes made in SOLR-16931 (9.4) attempted to work around problems that
> existed when attempting to enable debugging (to get score explanations) in
> combination with using {{reRankScale}} ...
> {quote}The reason for this is that in order to do proper explain for
> minMaxScaling you need to know the min and max score in the result set. This
> piece of state is maintained in the ReRankScaler itself which is inside of
> the ReRankQuery. But for this information to be populated the query must
> first be run. In distributed mode, explain is called in the second pass when
> the ids query is run so the state needed for the explain is not populated. ...
> {quote}
> However, the solution attempted was incomplete and failed to account for
> multiple factors...
> {quote}... The PR attached to this addresses this problem by doing a single
> pass distributed query if debugQuery is turned on and if reRank score scaling
> is applied. I'll add a distributed test for this as well.
> This change is very limited in scope because the single pass distributed is
> only switched on in the very specific case when debugQuery=true and
> reRankScaling is on.
> {quote}
> * NPEs are still possible...
> ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is
> what triggers explain logic) the new code only checked for specific debug
> request param combinations:
> *** {{debugQuery=true}} (a legacy option intended only for backcompat)
> *** {{debug=true}} (intended as an alias for {{debug=all}})
> ** It did not check for either of these options, which if used will still
> trigger an NPE...
> *** {{debug=results}} (which actually dictates the value of
> {{ResponseBuilder.isDebugResults()}})
> *** {{debug=all}} (a shorthand for setting all debug options)
> * The attempt to force a single pass query didn't modify the correct variable
> ** The new code modified a conditional based on a {{boolean
> distribSinglePass}} for setting {{sreq.purpose}} and
> {{rb.onePassDistributedQuery}}
> ** But it did not modify the value of the {{boolean distribSinglePass}}
> itself - meaning other logic that uses that variable in that method still
> assumes multiple passes will be used.
> ** In particular, this means that even though a single pass is used for
> both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}}
> requested by the user is not propagated as part of this request
> *** Only the uniqueKey and any sort fields are ultimately returned to the
> user.
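
To make the two gaps listed above easier to see, here is a small standalone Java sketch. It is only a simplified model of the behavior described in this issue, not the Solr code: the class {{RerankDebugSketch}} and every method in it are hypothetical names invented for illustration, and the uniqueKey is assumed to be {{id}} as in the example collection.

```java
import java.util.List;
import java.util.Map;

/** Simplified model of the two flaws described above. None of this is Solr code. */
public class RerankDebugSketch {

    /** Models the condition that actually triggers explain logic (assumed mapping of params). */
    public static boolean isDebugResults(Map<String, String> params) {
        String debug = params.getOrDefault("debug", "");
        return "true".equals(params.get("debugQuery"))
            || debug.equals("true")
            || debug.equals("all")
            || debug.equals("results");
    }

    /** Models the narrower check: only debugQuery=true and debug=true are recognized. */
    public static boolean narrowCheck(Map<String, String> params) {
        return "true".equals(params.get("debugQuery"))
            || "true".equals(params.get("debug"));
    }

    /** True when explain would run although the single pass was never forced. */
    public static boolean explainWithoutSinglePass(Map<String, String> params) {
        return isDebugResults(params) && !narrowCheck(params);
    }

    /** Models the patched conditional: a single pass is used if either flag is set. */
    public static boolean usesOnePass(boolean distribSinglePass, boolean forcedSinglePass) {
        return distribSinglePass || forcedSinglePass;
    }

    /**
     * Models the field-list logic that still reads the unmodified flag, so a
     * forced single pass only asks for the uniqueKey instead of the user's fl.
     */
    public static List<String> shardFl(boolean distribSinglePass, List<String> userFl) {
        return distribSinglePass ? userFl : List.of("id");
    }

    public static void main(String[] args) {
        for (String value : List.of("true", "all", "results")) {
            Map<String, String> params = Map.of("debug", value);
            System.out.println("debug=" + value
                + " explain without single pass: " + explainWithoutSinglePass(params));
        }
        List<String> userFl = List.of("id", "name", "price");
        // Single pass forced for debugging, but distribSinglePass itself left false.
        System.out.println("one pass: " + usesOnePass(false, true)
            + ", fl sent: " + shardFl(false, userFl));
    }
}
```

In this model, checking {{isDebugResults()}} directly would cover every option that enables explain, and updating the flag itself (rather than only the conditional that reads it) would let the field-list logic see the forced single pass.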