Also, as a workaround, I can confirm that setting indexSearcherExecutor to 0
on the Solr side mitigates the problem for now, as mentioned in
https://issues.apache.org/jira/browse/SOLR-17642
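
For anyone skimming the thread, here is a rough toy model of the failure
quoted below. It is not Lucene or Solr code: the class and method names are
made up for illustration, and it assumes the kNN query is effectively
evaluated twice (once while collecting, once while populating scores) and
that the two evaluations can return different top-k sets when segments are
searched concurrently.

```java
import java.util.List;
import java.util.Map;

/** Toy model (hypothetical, not Lucene code) of the "doesn't match the query" error. */
final class KnnMismatchSketch {

    /**
     * Stand-in for the score-population step: every doc id that was collected
     * must still be matched by the kNN hits of the second evaluation.
     */
    public static float[] populateScores(List<Integer> collectedDocIds,
                                         Map<Integer, Float> knnHits) {
        float[] scores = new float[collectedDocIds.size()];
        for (int i = 0; i < scores.length; i++) {
            int docId = collectedDocIds.get(i);
            Float score = knnHits.get(docId);
            if (score == null) {
                // The collected doc is missing from the second top-k set.
                throw new IllegalArgumentException(
                        "Doc id " + docId + " doesn't match the query");
            }
            scores[i] = score;
        }
        return scores;
    }

    public static void main(String[] args) {
        // First evaluation: docs 1, 2 and 3 were collected as the top-k.
        List<Integer> collected = List.of(1, 2, 3);
        // Second evaluation (assumed): doc 4 replaced doc 3 in the top-k.
        Map<Integer, Float> secondRun = Map.of(1, 0.9f, 2, 0.8f, 4, 0.7f);
        try {
            populateScores(collected, secondRun);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Doc id 3 doesn't match the query
        }
    }
}
```

If both evaluations return the same set, the lookup never fails, which would
fit with the error disappearing once the searcher executor is disabled.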

On Thu, Jan 30, 2025 at 10:40 AM Benjamin Trent <ben.w.tr...@gmail.com>
wrote:

> Yes, FYI, we found the bug in the kNN query
> https://github.com/apache/lucene/issues/14180
>
> Basically, threads sharing information back for graph early termination
> can lead to inconsistency. We should fix this in Lucene, though I do not
> know the timeline or how simple the fix will be.
>
> Thank you Dr. Andreas Moll for bringing this to our attention!
>
> On Thu, Jan 30, 2025 at 1:26 PM Varun Thacker <va...@vthacker.in> wrote:
>
>> Benjamin - I think this has to do with Solr 9.7+ using a thread executor
>> for searching.
>>
>> I can take Solr 9.7 or Solr 9.8 and just undo this one line in
>> SolrIndexSearcher
>> <https://github.com/apache/solr/blob/7af2ad56753bf75b8391639233dcc8d465767de9/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L385>
>> and the query doesn't fail:
>>
>> -    super(wrapReader(core, r), core.getCoreContainer().getIndexSearcherExecutor());
>> +    super(wrapReader(core, r));
>>
>> On Thu, Jan 23, 2025 at 6:49 AM Benjamin Trent <ben.w.tr...@gmail.com>
>> wrote:
>>
>>> From the vector search side of things, nothing immediately pops up as a
>>> cause. https://lucene.apache.org/core/9_11_0/changes/Changes.html
>>>
>>> The given query is just a regular kNN query. So, its rewrite should
>>> behave similarly as it did in 9.10.
>>>
>>> One significant change for kNN search behavior did happen in 9.10:
>>> https://github.com/apache/lucene/pull/12962 But since this issue
>>> doesn't happen in 9.10, I am at a loss.
>>>
>>> Since `knn` rewrites itself to a `KnnScoreDoc` object, it's surprising
>>> that the result set should change between collecting and scoring.
>>>
>>> I wonder if Solr adjusted due to this deprecation or started using
>>> collector managers and inadvertently tripped over a bug or something?
>>>
>>> Or, something was added in Apache Lucene 9.11 where the same knn query
>>> over the same index could result in a different set of top-k docs. Though,
>>> I would have thought the main candidate there would be:
>>> https://github.com/apache/lucene/pull/12962 (in Lucene 9.10).
>>>
>>> On Thu, Jan 23, 2025 at 3:46 AM Moll, Dr. Andreas <m...@juris.de.invalid>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to inform you about a behavior change in Solr 9.6 (Lucene 9.10)
>>>> vs. Solr 9.7 (Lucene 9.11) for vector searches.
>>>>
>>>> We heavily rely on vector searches for embeddings in combination with
>>>> filter queries on the parent documents.
>>>>
>>>> Our queries in general looked like this:
>>>>
>>>> select?q={!knn f=vector topK=2048}[...]
>>>> rows=100
>>>> fq={!child of='childtype:root'}…
>>>> start=0
>>>> sort=score desc,ID desc
>>>>
>>>> With Solr 9.7 and higher, this results in ~10% of the queries producing
>>>> the following error:
>>>>
>>>> java.lang.IllegalArgumentException: Doc id 27227879 doesn't match the query
>>>>         at org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:478) ~[?:?]
>>>>         at org.apache.solr.search.SolrIndexSearcher.populateScoresIfNeeded(SolrIndexSearcher.java:1812) ~[?:?]
>>>>         at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:2001) ~[?:?]
>>>>         at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1775) ~[?:?]
>>>>         at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:772) ~[?:?]
>>>>         at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:767) ~[?:?]
>>>>
>>>> After several days of debugging, I confirmed that the number of errors
>>>> correlates to the topK value:
>>>>
>>>>    - k = 8 -> 44 errors
>>>>    - k = 2048 -> 17 errors
>>>>    - k = 16384 -> 1 error
>>>>
>>>> I found a workaround for the issue by modifying the sort parameter to:
>>>>
>>>> sort=score desc
>>>>
>>>> With this change, our queries work like a charm again. The initial
>>>> thought of adding the ID desc sorting was to get more reproducible
>>>> results, but it is not strictly necessary for us.
>>>>
>>>> Could you clarify whether this change in Solr/Lucene was intended? If
>>>> so, perhaps the documentation on vector queries should note that
>>>> adding a secondary sort can cause errors.
>>>>
>>>> Best regards,
>>>> Dr. Andreas Moll
