Re: Filtering before a vector search.

2022-01-19 Thread Julie Tibshirani
+1 from me too, this will be a really helpful feature. I've done some background research and found a couple of aspects that are tricky. If the filter only matches a small percentage of documents, HNSW can quickly degrade to a brute-force scan. With live docs this isn't a big problem, because our

Re: Filtering before a vector search.

2022-01-19 Thread Joel Bernstein
https://issues.apache.org/jira/browse/LUCENE-10382 Joel Bernstein http://joelsolr.blogspot.com/

Re: Filtering before a vector search.

2022-01-19 Thread Joel Bernstein
Ok, I can create the jira. Joel Bernstein http://joelsolr.blogspot.com/

Re: Filtering before a vector search.

2022-01-19 Thread Michael Sokolov
+1 we should extend the functionality to support any Bits, not just liveDocs; we need to propose an API. The implementation should not be too hard - we need to intersect the user-supplied Bits with liveDocs and use that to filter.
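
The intersection described in the message above (user-supplied Bits combined with liveDocs) could look roughly like the following. This is only a sketch, not the API that was eventually proposed: the `Bits` interface here is a local stand-in with the same two methods as Lucene's `org.apache.lucene.util.Bits`, and `intersect`, `fromBitSet`, and the class name are hypothetical names chosen for illustration. It assumes that a null liveDocs means the segment has no deletions, as in Lucene.

```java
import java.util.BitSet;

public class FilteredBitsSketch {

    /** Local stand-in with the same two methods as Lucene's org.apache.lucene.util.Bits. */
    public interface Bits {
        boolean get(int index);
        int length();
    }

    /** Hypothetical helper: exposes a java.util.BitSet as a fixed-length Bits view. */
    public static Bits fromBitSet(BitSet set, int length) {
        return new Bits() {
            @Override public boolean get(int index) { return set.get(index); }
            @Override public int length() { return length; }
        };
    }

    /**
     * Hypothetical helper: a doc is accepted only if it passes the user filter
     * AND is still live. Either argument may be null, meaning "no restriction".
     */
    public static Bits intersect(Bits filter, Bits liveDocs) {
        if (liveDocs == null) {
            return filter;   // assumption: null liveDocs means no deleted docs
        }
        if (filter == null) {
            return liveDocs; // no user filter: fall back to liveDocs only
        }
        if (filter.length() != liveDocs.length()) {
            throw new IllegalArgumentException("filter and liveDocs must cover the same doc ID range");
        }
        return new Bits() {
            @Override public boolean get(int index) {
                return filter.get(index) && liveDocs.get(index);
            }
            @Override public int length() {
                return liveDocs.length();
            }
        };
    }

    public static void main(String[] args) {
        BitSet filter = new BitSet();
        filter.set(0); filter.set(1); filter.set(3); // docs matched by the user filter
        BitSet live = new BitSet();
        live.set(1); live.set(2); live.set(3);       // doc 0 has been deleted

        Bits accept = intersect(fromBitSet(filter, 4), fromBitSet(live, 4));
        for (int doc = 0; doc < accept.length(); doc++) {
            System.out.println("doc " + doc + " accepted: " + accept.get(doc));
        }
    }
}
```

In this toy segment only docs 1 and 3 are accepted: doc 0 matches the filter but is deleted, and doc 2 is live but does not match the filter. The lazy wrapper avoids materializing a new bit set per segment; whether that is preferable to precomputing the intersection would depend on how often the graph search calls `get`.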

Filtering before a vector search.

2022-01-19 Thread Joel Bernstein
Hi, Thanks for all the work on the vector search! I was wondering if there was a way using KnnVectorQuery to filter the docs this query looks at. Right now the searchLeaf method passes in the liveDocs to LeafReader.searchNearestVectors, but there appears to be no way to have the KnnVectorQuery