[
https://issues.apache.org/jira/browse/SOLR-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408746#comment-15408746
]
Hoss Man commented on SOLR-9377:
--------------------------------
I spent some time today looking into this and how to "fix" it.
My initial impression was that {{SubQueryAugmenter(Factory)}} was doing more
work than it needed to. It currently creates an
{{EmbeddedSolrServer(req.getCore())}} and operates at the (solrj)
"{{QueryRequest}}/{{QueryResponse}}" level to execute the "subquery", pulling
back a {{SolrDocumentList}} to populate its own custom {{Result extends
ResultContext}} for each document.
My thinking was, we could bypass {{EmbeddedSolrServer}} by just asking the
{{SolrCore}} to execute a {{SolrQueryRequest}} we create directly around the
realtime searcher (see below), and then use the {{DocList}} from the resulting
{{SolrQueryResponse}} along w/ the other pieces we've accumulated to create a
regular old {{BasicResultContext}} for each document. ala...
{code}
private static class SubQuerySolrQueryRequest extends SolrQueryRequestBase {
  // we'd pass the ResultContext.getSearcher() here, so these queries would
  // have access to the realtime searcher if we're used in an RTG request...
  public SubQuerySolrQueryRequest(SolrCore core, SolrParams params,
                                  RefCounted<SolrIndexSearcher> searcherHolder) {
    super(core, params);
    this.searcherHolder = searcherHolder;
  }
}
{code}
The problem with this idea is the {{fromIndex}} param that this transformer
supports. It leans heavily on the existing logic in
{{EmbeddedSolrServer.request(...)}} to figure out the correct core to use.
We'd have to refactor/unwind/duplicate some of that in order to
operate directly at the "go get a core by name, now execute a
SubQuerySolrQueryRequest against it" layer of the abstraction.
----
Ultimately I'm starting to wonder if the current behavior is actually the
best/correct behavior?
In the test code that led me to file this bug (see TestRandomFlRTGCloud &
SubQueryValidator) the "problems" that arise are because the validation code
for the {{\[subquery\]}} I'm using expects (at least) the original document to
match the subquery against one of its field values -- and when the document is
read from the tlog it doesn't, because the subquery is not run against the
realtime searcher.
Perhaps that's ok? Perhaps the only thing that's really important is that
_values_ from the doc used when building the subquery are accurate, and come
from the tlog, and the query itself can (or perhaps _MUST_?) still be run
against the same currently open searcher as if the user ran that subquery
themselves?
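To make that distinction concrete, here is a toy model in plain Java. It uses no Solr classes, and every name in it ({{RtgSubqueryToy}}, {{realTimeGet}}, {{subquery}}) is hypothetical. It only assumes the behavior described above: field values are read from the uncommitted (tlog-like) state, while the subquery runs against what the currently open searcher can see.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: no Solr classes are used and all names are hypothetical. */
public class RtgSubqueryToy {
    // Docs visible to the currently open searcher (id -> field value).
    private final Map<String, String> committed = new HashMap<>();
    // Docs added since the last commit; a stand-in for the tlog.
    private final Map<String, String> uncommitted = new HashMap<>();

    public void add(String id, String fieldValue) {
        uncommitted.put(id, fieldValue);
    }

    public void commit() {
        committed.putAll(uncommitted);
        uncommitted.clear();
    }

    /** Real-time get: prefers the uncommitted copy of the doc. */
    public String realTimeGet(String id) {
        String value = uncommitted.get(id);
        return value != null ? value : committed.get(id);
    }

    /** Subquery: matches only docs the open searcher can see. */
    public List<String> subquery(String fieldValue) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : committed.entrySet()) {
            if (e.getValue().equals(fieldValue)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        RtgSubqueryToy index = new RtgSubqueryToy();
        index.add("1", "red");
        String value = index.realTimeGet("1");      // accurate value from the "tlog"
        System.out.println(index.subquery(value));  // prints [] : doc not yet searchable
        index.commit();
        System.out.println(index.subquery(value));  // prints [1]
    }
}
```

In this model the "at least 1 match" assertion fails before the commit even though the value used to build the subquery is accurate, which is the situation the validator runs into.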
----
[~mkhludnev]: do you have any opinions on what we should consider the "correct"
behavior in this situation?
> [subquery] augmenter doesn't work with RTG on uncommited docs
> -------------------------------------------------------------
>
> Key: SOLR-9377
> URL: https://issues.apache.org/jira/browse/SOLR-9377
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Hoss Man
>
> Spinning off from SOLR-9314...
> The {{[subquery]}} DocTransformer can give unexpected results when used with
> RTG on uncommitted docs.
> Test code demonstrating the problem is being added to TestRandomFlRTGCloud as
> part of SOLR-9314, but it's being disabled for now due to this current bug.
> As noted in that jira...
> {quote}
> The subquery validation tries to search for all docs with the same field
> value as the current doc, asserting that there is always at least 1 match –
> but this assertion currently fails ... by the looks of it this is (obviously)
> because it doesn't know to use the realtime searcher re-opened by the RTG
> ... but based on how the SubQueryAugmenter is implemented, I'm not even
> certain how to go about it.
> {quote}