[
https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847076#comment-17847076
]
Chris M. Hostetter commented on SOLR-17296:
-------------------------------------------
I would argue that the "old" problem of NPE is the lesser of two evils compared
to the "new" problem of silently failing to return fields – i'd much rather
people who try to enable debugging get an error if it doesn't work then maybe
not noticed that their response is missing fields they expected (when it's
{{fl=*}} and you have a lot of stored fields it's really obvious something is
wrong – if you use {{fl=id,special_rare_field}} you might not realize that
{{special_rare_field}} is missing from docs that should actually have it.
I also think that, in general, we should not *CHANGE* the core logic of how
requests are processed (in this case to use single pass) because a debugging
feature is enabled. If getting accurate debugging of reranked results requires
using a single pass, let's *_inform_* our users of that, and let them
*c{_}hoose{_}* to do it.
I recommend we immediately:
* revert the logic added by SOLR-16931
* Open a new SOLR-XXX jira for long term consideration of how to redesign
debug component to get the multistage info needed for explaining reranked
results
* harden {{ReRankScaler.explain}} so that if the dat it expects isn't
available, it wraps the data it *DOES* have with a {{"0.0 == scaled reranking
not available (consider using distrib.singlePass - see SOLR-XXX)"}}
> rerank w/scaling (still) broken when using debug to get explain info
> --------------------------------------------------------------------
>
> Key: SOLR-17296
> URL: https://issues.apache.org/jira/browse/SOLR-17296
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 9.4
> Reporter: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-17296.test.patch
>
>
> The changes made in SOLR-16931 (9.4) attempted to work around problems that
> existed when attempting to enable degugging (to get score explanations) in
> combination with using {{reRankScale}} ...
> {quote}The reason for this is that in order to do proper explain for
> minMaxScaling you need to know the min and max score in the result set. This
> piece of state is maintained in the ReRankScaler itself which is inside of
> the ReRankQuery. But for this information to be populated the query must
> first be run. In distributed mode, explain is called in the second pass when
> the ids query is run so the state needed for the explain is not populated. ...
> {quote}
> However, the solution attempted was incomplete and failed to account for
> multiple factors...
> {quote}... The PR attached to this addresses this problem by doing a single
> pass distributed query if debugQuery is turned on and if reRank score scaling
> is applied. I'll add a distributed test for this as well.
> This change is very limited in scope because the single pass distributed is
> only switched on in the very specific case when debugQuery=true and
> reRankScaling is on.
> {quote} * NPEs are still possible...
> ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is
> what triggers explain logic) the new code only checked for specific debug
> request param combinations:
> *
> **
> *** {{debuQuery=true}} (a legacy option intended only for backcompat)
> *** {{debug=true}} (intended as an alias for {{debug=all}}
> ** It did not check for either of these options, which if used will still
> trigger an NPE...
> *** {{debug=results}} (which actually dictates the value of
> {{ResponseBuilder.isDebugResults()}}
> *** {{debug=all}} (a short hand for setting all debug options)
> * the attempt to force a single pass query didn't modify the correct variable
> ** The new code modified a conditional based on a {{boolean
> distribSinglePass}} for setting {{sreq.purpose}} and
> {{rb.onePassDistributedQuery}}
> ** But it did not modify the value of the {{boolean distribSinglePass}}
> itself - meaning other logic that uses that variable in that method still
> assumes multiple passes will be used.
> ** In particular, these means that even though a single pass is used for
> both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}}
> requested by the user is not propagated as part of this request
> *** Only the uniqueKey and any sot fields are ultimately returned to the
> user.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]