Indeed, the load order can influence Lucene's approximate nearest neighbor
search results.

If both indexes load the same documents sequentially and in the same order,
then I believe you would get the same results. But we consider this an
implementation detail rather than a guarantee that Lucene should provide.
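
As a rough illustration of why insertion order matters, here is a toy sketch
in Python. It is not Lucene's HNSW implementation: it builds a single-layer
proximity graph without neighbor pruning, and every name in it (build, search,
knn) and every parameter (m, ef, the random data) is made up for the example.
The only point is that each new vector is linked to neighbors found in the
graph as it exists at insertion time, so a different order can give a
different graph, and an approximate search over it can return different hits.

```python
import heapq
import random


def dist(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(vectors, graph, entry, query, ef):
    # Beam search from the entry node, keeping the ef closest nodes seen.
    # Returns (distance, doc_id) pairs sorted by increasing distance.
    d0 = dist(vectors[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    best = [(-d0, entry)]        # max-heap (negated distances) of results
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break                # nothing left that can improve the results
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(vectors[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-neg_d, n) for neg_d, n in best)


def build(vectors, order, m=4, ef=32):
    # Insert documents one at a time in the given order. Each new document
    # is linked to the m closest nodes found in the graph built so far.
    graph, entry = {}, None
    for doc in order:
        if entry is None:
            graph[doc], entry = [], doc
            continue
        hits = search(vectors, graph, entry, vectors[doc], ef)[:m]
        graph[doc] = [n for _, n in hits]
        for n in graph[doc]:
            graph[n].append(doc)
    return graph, entry


def knn(vectors, order, query, k, ef=32):
    graph, entry = build(vectors, order)
    return [n for _, n in search(vectors, graph, entry, query, ef)[:k]]


rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
query = [rng.random() for _ in range(8)]

order_a = list(range(500))
order_b = order_a[:]
rng.shuffle(order_b)             # same documents, different load order

same_order_1 = knn(vectors, order_a, query, k=10)
same_order_2 = knn(vectors, order_a, query, k=10)
other_order = knn(vectors, order_b, query, k=10)

print("same order, identical hits:", same_order_1 == same_order_2)
print("hits shared across orders:", len(set(same_order_1) & set(other_order)))
```

Building twice with the same order always gives the same hits in this sketch.
The overlap between the two orders depends on the data and parameters, so it
may or may not be complete for any given query.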

On Thu, Sep 12, 2024 at 7:03 PM Marc Davenport
<madavenp...@cargurus.com.invalid> wrote:

> Hello,
> I've been working on this personalization project using KNN queries and I
> have a couple of questions, but one is more pressing for me than the others.
>
> 1) Inconsistency between index instances:
> All of the same documents are loaded into different indexes. They may be
> loaded in different order, but the set of documents will be consistent when
> done. I'm finding that when I ask for the 1000 KNN documents, I sometimes
> get inconsistent results between indexes. Results are always consistent
> within an individual instance. If we assume I haven't made a mistake and
> the universe of documents is the same in all instances, can the document
> load order affect what is considered the nearest neighbors?
> What if I am processing updates to the index at different rates on each
> machine, but the end data is all the same?
>
> Thank you,
> Marc
>


-- 
Adrien
