I was able to narrow this down to the commit
https://github.com/apache/solr/commit/cfec121bab2ecfc4c06e20a5533596025ae63d98
as the cause of this issue.

Without that change, the bug doesn't reproduce.
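
For anyone following along, here is a toy model of the check that throws in the stack trace quoted below. It is only a sketch: PopulateScoresModel and its populateScores method are hypothetical stand-ins written with plain JDK classes, not Lucene or Solr code. The assumption it illustrates (not confirmed as the root cause) is that the hits collected by the first search pass are later checked against a doc set produced by a second rewrite, and a hit missing from that set triggers the error.

```java
import java.util.Set;

/** Toy model only: nothing here is a Lucene or Solr class. */
final class PopulateScoresModel {

    /**
     * Mimics the consistency check behind the error message: every doc ID
     * collected by the search pass must be matched by the rewritten query.
     * The rewritten query is modeled as a plain set of matching doc IDs.
     */
    static void populateScores(int[] hitDocIds, Set<Integer> docsMatchedByRewrite) {
        for (int docId : hitDocIds) {
            if (!docsMatchedByRewrite.contains(docId)) {
                throw new IllegalArgumentException(
                        "Doc id " + docId + " doesn't match the query");
            }
        }
    }

    public static void main(String[] args) {
        // Doc IDs collected by the (sorted) search pass. Values are made up.
        int[] hits = {3, 7, 42};

        // Assumption: the rewrite used for searching matched exactly these docs.
        Set<Integer> firstRewrite = Set.of(3, 7, 42);
        // Hypothetical: a second rewrite ends up with a different doc set.
        Set<Integer> secondRewrite = Set.of(3, 7, 99);

        populateScores(hits, firstRewrite); // passes silently

        try {
            populateScores(hits, secondRewrite);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Doc id 42 doesn't match the query
        }
    }
}
```

If the two doc sets are identical the check passes, which would fit the observation that the error only shows up for some queries.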

On Thu, Jan 16, 2025 at 6:49 PM Varun Thacker <va...@vthacker.in> wrote:

> I misspoke. For regular search, KnnFloatVectorQuery is the Query object
> before the rewrite. After the
> rewrite, it's AbstractKnnVectorQuery$DocAndScoreQuery.
>
> Then, when Solr asks for the score, the same Query object is passed to
> the rewrite and becomes an AbstractKnnVectorQuery$DocAndScoreQuery.
>
> I'll try looking with some fresh eyes tomorrow.
>
> On Thu, Jan 16, 2025 at 6:00 PM Varun Thacker <va...@vthacker.in> wrote:
>
>> I'll have to recreate my setup, since I tried rebuilding Solr
>> without some PRs and it wiped everything out (my mistake!).
>>
>> I was able to capture the query Solr sends for search (KnnFloatVectorQuery)
>> versus the one it uses for getting the score
>> (AbstractKnnVectorQuery$DocAndScoreQuery). This might give some
>> breadcrumbs to Mike while I try to look into it more tomorrow.
>>
>>
>> query = {KnnFloatVectorQuery@10048}
>> "KnnFloatVectorQuery:value[0.1234,...][160000]"
>>  target = {float[768]@10064} [...
>>  field = "value"
>>  k = 160000
>>  filter = null
>>  isDeprecatedRewriteMethodOverridden = false
>>  CLASS_NAME_HASH = 1536329572
>>
>>  query = {AbstractKnnVectorQuery$DocAndScoreQuery@10074}
>> "DocAndScore[160000]"
>>  k = 160000
>>  docs = {int[160000]@10084} [... more]
>>  scores = {float[160000]@10085} [...]
>>  contextIdentity = {Object@10087}
>>  isDeprecatedRewriteMethodOverridden = false
>>  CLASS_NAME_HASH = 1706435309
>>
>> On Thu, Jan 16, 2025 at 5:28 PM Varun Thacker <va...@vthacker.in> wrote:
>>
>>> I have an index where I can repro it with 100% success. Let me look into
>>> what's causing it and create a Solr Jira
>>>
>>> On Mon, Oct 21, 2024 at 11:11 AM Michael Sokolov <msoko...@gmail.com>
>>> wrote:
>>>
>>>> I think this might be a better question for solr-user@? EG I don't
>>>> understand how Solr decides which Query to send to populateScores --
>>>> is it the same one that was used to generate the matches in topDocs?
>>>> It seems as if it should be, but then this error shouldn't happen ...
>>>> I wonder if you can print out the queries sent to search() and to
>>>> populateScores()?
>>>>
>>>> On Thu, Oct 17, 2024 at 5:29 AM Moll, Dr. Andreas <m...@juris.de.invalid>
>>>> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > we are currently testing Solr 9.7 and are seeing an error that we did
>>>> not see with Solr 9.6.1. We think the problem might lie in the
>>>> underlying Lucene code base:
>>>> >
>>>> > ERROR o.a.s.h.RequestHandlerBase Server exception =>
>>>> > at
>>>> org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478)
>>>> > java.lang.IllegalArgumentException: Doc id 48567944 doesn't match the
>>>> query
>>>> > at
>>>> org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.search.SolrIndexSearcher.populateScoresIfNeeded(SolrIndexSearcher.java:1766)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1955)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1729)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:726)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:721)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1690)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:432)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:456)
>>>> ~[?:?]
>>>> > at
>>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226)
>>>> ~[?:?]
>>>> >
>>>> > We index the embeddings as nested fields and can reproduce the error
>>>> with the following code:
>>>> >
>>>> > We are able to reproduce the error using the same query. It seems to
>>>> occur in approximately 5% of all vector queries.
>>>> > We have one server running Solr 9.7 and three servers running Solr
>>>> 9.6.1, all working on the same frozen index.
>>>> > Only the Solr 9.7 server encounters the issue. We can rule out Java
>>>> 21 and the corresponding optimizations or the new multithreading parameter
>>>> as the root cause of the problem.
>>>> > The index contains the document referenced in the error message.
>>>> >
>>>> >
>>>> > String q = "{!knn f=vector topK=20}[0.031046804...";
>>>> > SolrQuery sq = new SolrQuery("{!cache=false}" + q);
>>>> > sq.addField("score"); // no error without the score field
>>>> > sq.setRows(14); // The defective document must be included in the result
>>>> > sq.setSort("ID", ORDER.asc); // Sort direction is not important, but the ID field is; no error when sorting by e.g. score
>>>> > final QueryRequest r = new QueryRequest(sq, METHOD.POST);
>>>> > SolrClient solrClient = SolRConnector.createServer(server);
>>>> > QueryResponse response = r.process(solrClient);
>>>> >
>>>> > Is there any additional information we can provide to help resolve
>>>> this error?
>>>> >
>>>> > Best regards
>>>> >
>>>> > Andreas Moll
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>
>>>>
