We have seen this, and when we tested against Solr 9.8 (currently being released), the error went away. It turned out to be a weird thing about the search executor, but we couldn't necessarily narrow down why it happened.
Anyway, please test out with the new Solr 9.8.0 release when it is available within the next day, and let us know if that fixes the problem for you.

- Houston

On Tue, Jan 21, 2025 at 11:50 AM Alessandro Benedetti <a.benede...@sease.io> wrote:

> I'm not sure the KNN search supports multiple sort conditions.
> I should do some deep dive in the Lucene code but I don't have time in the foreseeable short future.
> I can imagine that anyway it would only support 're-ranking' the retrieved topK by the additional score condition, this can not affect how the topK is retrieved, at best how it's sorted.
>
> What would be really valuable is if you can reproduce the issue and even better if the issue can be reproduced using a solr test (org.apache.solr.search.neural.KnnQParserTest)
> Cheers
> --------------------------
> *Alessandro Benedetti*
> Director @ Sease Ltd.
> *Apache Lucene/Solr Committer*
> *Apache Solr PMC Member*
>
> e-mail: a.benede...@sease.io
>
> *Sease* - Information Retrieval Applied
> Consulting | Training | Open Source
>
> Website: Sease.io <http://sease.io/>
> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter <https://twitter.com/seaseltd> | Youtube <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github <https://github.com/seaseltd>
>
> On Tue, 21 Jan 2025 at 13:04, Moll, Dr. Andreas <m...@juris.de.invalid> wrote:
>
> > Hi,
> > I want to inform you about a behavior change in Solr 9.6 (Lucene 9.10) vs. Solr 9.7 (Lucene 9.11) for vector searches.
> > We heavily rely on vector searches for embeddings in combination with filter queries on the parent documents.
> > Our queries in general looked like this:
> >
> > select?q={!knn f=vector topK=2048}[...]
> > rows=100
> > fq={!child of='childtype:root'}...
> > start=0
> > sort=score desc,ID desc
> >
> > With Solr 9.7 and higher, this results in ~10% of the queries producing the following error:
> >
> > java.lang.IllegalArgumentException: Doc id 27227879 doesn't match the query
> >   at org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478) ~[?:?]
> >   at org.apache.solr.search.SolrIndexSearcher.populateScoresIfNeeded(SolrIndexSearcher.java:1812) ~[?:?]
> >   at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:2001) ~[?:?]
> >   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1775) ~[?:?]
> >   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:772) ~[?:?]
> >   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:767) ~[?:?]
> >
> > After several days of debugging, I confirmed that the number of errors correlates to the topK value:
> >
> > * k = 8 -> 44 errors
> > * k = 2048 -> 17 errors
> > * k = 16384 -> 1 error
> >
> > I found a workaround for the issue by modifying the sort parameter to:
> >
> > sort=score desc
> >
> > With this change, our queries work like a charm again. The initial thought of adding the ID desc sorting was to get more reproducible results, but it is not strictly necessary for us.
> > Could you clarify if this change in Solr/Lucene was intended? If so, perhaps you want to add documentation on vector queries that adding an additional sorting might cause errors.
> > Best regards,
> > Dr. Andreas Moll
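
For anyone who wants to compare the two request variants from the report side by side, here is a minimal sketch that only builds the request parameters; it does not contact Solr. The field names (`vector`, `ID`, `childtype:root`) are taken from the message above. The function name `build_knn_params`, the `parent_query` argument (a stand-in for the part of the filter that was elided in the report), and the example vector are hypothetical and would need to be replaced with real values.

```python
from urllib.parse import urlencode


def build_knn_params(vector, parent_query, top_k=2048, rows=100, tie_break=True):
    """Build select-handler parameters for a KNN query with a child filter.

    tie_break=True  -> sort=score desc,ID desc (the variant reported as failing)
    tie_break=False -> sort=score desc         (the reported workaround)
    """
    # Solr expects the query vector as a bracketed, comma-separated list.
    vec = "[" + ",".join(str(x) for x in vector) + "]"
    sort = "score desc,ID desc" if tie_break else "score desc"
    return {
        "q": "{!knn f=vector topK=%d}%s" % (top_k, vec),
        # parent_query is a hypothetical placeholder; the original filter
        # body was not included in the report.
        "fq": "{!child of='childtype:root'}%s" % parent_query,
        "rows": rows,
        "start": 0,
        "sort": sort,
    }


if __name__ == "__main__":
    failing = build_knn_params([0.1, 0.2, 0.3], parent_query="type:example")
    workaround = build_knn_params(
        [0.1, 0.2, 0.3], parent_query="type:example", tie_break=False
    )
    # Print the URL-encoded query strings so they can be pasted after "select?".
    print(urlencode(failing))
    print(urlencode(workaround))
```

Sending both variants against the same index with several topK values would be one way to check whether the error still correlates with the secondary sort after upgrading.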