Lastly, it looks like there were some changes to the collector executor in https://github.com/apache/solr/commit/7405bb19fa424ec79c6daaafd986670dc54d7dfe that fix the issue. I cannot repro the issue anymore.
So we don't need to create a Solr Jira after all.

On Fri, Jan 17, 2025 at 11:44 AM Varun Thacker <va...@vthacker.in> wrote:

> I was able to narrow down to
> https://github.com/apache/solr/commit/cfec121bab2ecfc4c06e20a5533596025ae63d98
> that causes this issue.
>
> Without that change the bug doesn't repro
>
> On Thu, Jan 16, 2025 at 6:49 PM Varun Thacker <va...@vthacker.in> wrote:
>
>> I misspoke, for regular search KnnFloatVectorQuery is the Query object
>> before the rewrite. After the rewrite it's
>> AbstractKnnVectorQuery$DocAndScoreQuery
>>
>> And then when Solr asks for the score the same Query object is passed to
>> the rewrite and becomes an AbstractKnnVectorQuery$DocAndScoreQuery
>>
>> I'll try looking with some fresh eyes tomorrow
>>
>> On Thu, Jan 16, 2025 at 6:00 PM Varun Thacker <va...@vthacker.in> wrote:
>>
>>> I'll have to recreate my setup again since I tried re-building Solr
>>> without some PRs and it wiped everything out (my mistake!)
>>>
>>> I was able to get the query Solr sends for search, KnnFloatVectorQuery, vs
>>> what it uses for getting the score,
>>> AbstractKnnVectorQuery$DocAndScoreQuery. This might give some
>>> breadcrumbs to Mike while I try to look into it more tomorrow
>>>
>>> query = {KnnFloatVectorQuery@10048} "KnnFloatVectorQuery:value[0.1234,...][160000]"
>>>   target = {float[768]@10064} [...
>>>   field = "value"
>>>   k = 160000
>>>   filter = null
>>>   isDeprecatedRewriteMethodOverridden = false
>>>   CLASS_NAME_HASH = 1536329572
>>>
>>> query = {AbstractKnnVectorQuery$DocAndScoreQuery@10074} "DocAndScore[160000]"
>>>   k = 160000
>>>   docs = {int[160000]@10084} [... more]
>>>   scores = {float[160000]@10085} [...]
>>>   contextIdentity = {Object@10087}
>>>   isDeprecatedRewriteMethodOverridden = false
>>>   CLASS_NAME_HASH = 1706435309
>>>
>>> On Thu, Jan 16, 2025 at 5:28 PM Varun Thacker <va...@vthacker.in> wrote:
>>>
>>>> I have an index where I can repro it with 100% success. Let me look
>>>> into what's causing it and create a Solr Jira
>>>>
>>>> On Mon, Oct 21, 2024 at 11:11 AM Michael Sokolov <msoko...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think this might be a better question for solr-user@? EG I don't
>>>>> understand how Solr decides which Query to send to populateScores --
>>>>> is it the same one that was used to generate the matches in topDocs?
>>>>> It seems as if it should be, but then this error shouldn't happen ...
>>>>> I wonder if you can print out the queries sent to search() and to
>>>>> populateScores()?
>>>>>
>>>>> On Thu, Oct 17, 2024 at 5:29 AM Moll, Dr. Andreas
>>>>> <m...@juris.de.invalid> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> > we are currently testing Solr 9.7 and experiencing an error we have
>>>>> > not seen before with Solr 9.6.1, and we think the problem might occur
>>>>> > in the underlying Lucene code base:
>>>>> >
>>>>> > ERROR o.a.s.h.RequestHandlerBase Server exception =>
>>>>> >   at org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478)
>>>>> > java.lang.IllegalArgumentException: Doc id 48567944 doesn't match the query
>>>>> >   at org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478) ~[?:?]
>>>>> >   at org.apache.solr.search.SolrIndexSearcher.populateScoresIfNeeded(SolrIndexSearcher.java:1766) ~[?:?]
>>>>> >   at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1955) ~[?:?]
>>>>> >   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1729) ~[?:?]
>>>>> >   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:726) ~[?:?]
>>>>> >   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:721) ~[?:?]
>>>>> >   at org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1690) ~[?:?]
>>>>> >   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:432) ~[?:?]
>>>>> >   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:456) ~[?:?]
>>>>> >   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226) ~[?:?]
>>>>> >
>>>>> > We index the embeddings as nested fields and can reproduce the error
>>>>> > with the following code:
>>>>> >
>>>>> > We are able to reproduce the error using the same query. It seems to
>>>>> > occur in approximately 5% of all vector queries.
>>>>> > We have one server running Solr 9.7 and three servers running Solr
>>>>> > 9.6.1, all working on the same frozen index.
>>>>> > Only the Solr 9.7 server encounters the issue. We can rule out Java
>>>>> > 21 and the corresponding optimizations or the new multithreading
>>>>> > parameter as the root cause of the problem.
>>>>> > The index contains the document referenced in the error message.
>>>>> >
>>>>> >
>>>>> > String q = "{!knn f=vector topK=20}[0.031046804...";
>>>>> > SolrQuery sq = new SolrQuery("{!cache=false}" + q);
>>>>> > sq.addField("score"); // no error without the score field
>>>>> > sq.setRows(14); // Defect document must be included in result
>>>>> > sq.setSort("ID", ORDER.asc); // Order is not important, but ID is. No error e.g. with score
>>>>> > final QueryRequest r = new QueryRequest(sq, METHOD.POST);
>>>>> > SolrClient solrClient = SolRConnector.createServer(server);
>>>>> > QueryResponse response = r.process(solrClient);
>>>>> >
>>>>> > Is there any additional information we can provide to help resolve
>>>>> > this error?
>>>>> >
>>>>> > Best regards
>>>>> >
>>>>> > Andreas Moll
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org