Re: Error Doc id doesn't match the query in vector searches

Varun Thacker Thu, 16 Jan 2025 18:49:51 -0800

I misspoke: for a regular search, KnnFloatVectorQuery is the Query object
before the rewrite. After the rewrite it's
AbstractKnnVectorQuery$DocAndScoreQuery.


And then when Solr asks for the score, the same Query object is passed to
the rewrite and becomes an AbstractKnnVectorQuery$DocAndScoreQuery.
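
To make that sequence concrete, here is a toy sketch in plain Java. It uses
no Lucene classes: DocAndScoreSketch, RewriteMismatchSketch and the doc ids
are made up for illustration. It also assumes that the rewrite done for
scoring ends up with a different doc set than the one the search produced,
which is only a guess at this point and not something confirmed here.

```java
import java.util.Arrays;

/** Toy stand-in for a rewritten KNN query: a fixed set of doc ids with scores. */
final class DocAndScoreSketch {
    private final int[] docs;      // matching doc ids, sorted ascending
    private final float[] scores;  // scores[i] belongs to docs[i]

    DocAndScoreSketch(int[] sortedDocs, float[] scores) {
        this.docs = sortedDocs.clone();
        this.scores = scores.clone();
    }

    /** Returns the score for docId, or throws if docId is not in this doc set. */
    float scoreOf(int docId) {
        int idx = Arrays.binarySearch(docs, docId);
        if (idx < 0) {
            throw new IllegalArgumentException("Doc id " + docId + " doesn't match the query");
        }
        return scores[idx];
    }
}

final class RewriteMismatchSketch {
    public static void main(String[] args) {
        // Hits returned by the search phase (hypothetical doc ids).
        int[] hits = {3, 7, 42};

        // Assumption: the rewrite used for scoring holds a different doc set (51, not 42).
        DocAndScoreSketch scoringRewrite =
            new DocAndScoreSketch(new int[] {3, 7, 51}, new float[] {0.9f, 0.8f, 0.7f});

        for (int doc : hits) {
            try {
                System.out.println("doc " + doc + " -> " + scoringRewrite.scoreOf(doc));
            } catch (IllegalArgumentException e) {
                // Doc 42 came from the search but is unknown to the scoring rewrite.
                System.out.println(e.getMessage());
            }
        }
    }
}
```

If the two rewrites always held the same doc set, the lookup for every hit
would succeed, so comparing the docs arrays of the two rewritten queries
seems like a reasonable next thing to check.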

I'll try looking with some fresh eyes tomorrow

On Thu, Jan 16, 2025 at 6:00 PM Varun Thacker <[email protected]> wrote:

> I'll have to recreate my setup, since I tried rebuilding Solr
> without some PRs and it wiped everything out (my mistake!).
>
> I was able to capture the query Solr sends for the search
> (KnnFloatVectorQuery) vs. the one it uses for getting the score
> (AbstractKnnVectorQuery$DocAndScoreQuery). This might give Mike some
> breadcrumbs while I try to look into it more tomorrow.
>
>
> query = {KnnFloatVectorQuery@10048}
> "KnnFloatVectorQuery:value[0.1234,...][160000]"
>  target = {float[768]@10064} [...
>  field = "value"
>  k = 160000
>  filter = null
>  isDeprecatedRewriteMethodOverridden = false
>  CLASS_NAME_HASH = 1536329572
>
>  query = {AbstractKnnVectorQuery$DocAndScoreQuery@10074}
> "DocAndScore[160000]"
>  k = 160000
>  docs = {int[160000]@10084} [... more]
>  scores = {float[160000]@10085} [...]
>  contextIdentity = {Object@10087}
>  isDeprecatedRewriteMethodOverridden = false
>  CLASS_NAME_HASH = 1706435309
>
> On Thu, Jan 16, 2025 at 5:28 PM Varun Thacker <[email protected]> wrote:
>
>> I have an index where I can repro it with 100% success. Let me look into
>> what's causing it and create a Solr Jira
>>
>> On Mon, Oct 21, 2024 at 11:11 AM Michael Sokolov <[email protected]>
>> wrote:
>>
>>> I think this might be a better question for solr-user@? E.g., I don't
>>> understand how Solr decides which Query to send to populateScores --
>>> is it the same one that was used to generate the matches in topDocs?
>>> It seems as if it should be, but then this error shouldn't happen ...
>>> I wonder if you can print out the queries sent to search() and to
>>> populateScores()?
>>>
>>> On Thu, Oct 17, 2024 at 5:29 AM Moll, Dr. Andreas <[email protected]>
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> > We are currently testing Solr 9.7 and experiencing an error we have
>>> > not seen before with Solr 9.6.1. We think the problem might occur in
>>> > the underlying Lucene code base:
>>> >
>>> > ERROR o.a.s.h.RequestHandlerBase Server exception =>
>>> >   at org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478)
>>> > java.lang.IllegalArgumentException: Doc id 48567944 doesn't match the query
>>> >   at org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478) ~[?:?]
>>> >   at org.apache.solr.search.SolrIndexSearcher.populateScoresIfNeeded(SolrIndexSearcher.java:1766) ~[?:?]
>>> >   at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1955) ~[?:?]
>>> >   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1729) ~[?:?]
>>> >   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:726) ~[?:?]
>>> >   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:721) ~[?:?]
>>> >   at org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1690) ~[?:?]
>>> >   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:432) ~[?:?]
>>> >   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:456) ~[?:?]
>>> >   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) ~[?:?]
>>> >
>>> > We index the embeddings as nested fields and can reproduce the error
>>> > with the following code:
>>> >
>>> > We are able to reproduce the error using the same query. It seems to
>>> > occur in approximately 5% of all vector queries.
>>> > We have one server running Solr 9.7 and three servers running Solr
>>> > 9.6.1, all working on the same frozen index.
>>> > Only the Solr 9.7 server encounters the issue. We can rule out Java 21
>>> > and the corresponding optimizations or the new multithreading parameter
>>> > as the root cause of the problem.
>>> > The index contains the document referenced in the error message.
>>> >
>>> >
>>> > String q = "{!knn f=vector topK=20}[0.031046804...";
>>> > SolrQuery sq = new SolrQuery("{!cache=false}" + q);
>>> > sq.addField("score"); // no error without the score field
>>> > sq.setRows(14); // the document named in the error must be included in the result
>>> > sq.setSort("ID", ORDER.asc); // sort order is not important, but sorting by ID is; no error e.g. when sorting by score
>>> > final QueryRequest r = new QueryRequest(sq, METHOD.POST);
>>> > SolrClient solrClient = SolRConnector.createServer(server);
>>> > QueryResponse response = r.process(solrClient);
>>> >
>>> > Is there any additional information we can provide to help resolve
>>> > this error?
>>> >
>>> > Best regards
>>> >
>>> > Andreas Moll
>>>
