Hi Michael, it is fully functional in the sense that Lucene will
build an HNSW graph for a vector-valued field, and you can then use the
VectorReader.search method to run KNN search. Next steps may
include some integration with lexical, inverted-index search so
that you can retrieve the N closest vectors subject to other constraints.
Today you can approximate that by oversampling and filtering. There is
also interest in pursuing other KNN search algorithms, and we have
been working to make sure the VectorFormat API (which might still get
renamed, to avoid confusion with the other kinds of vectors that exist
in Lucene) can support alternative KNN implementations.
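
To make the oversample-and-filter workaround concrete, here is a rough
sketch in plain Java. It does not use any Lucene API: the brute-force
nearest() method is only a stand-in for whatever KNN search you call, and
the names filteredNearest, oversample and accept are made up for this
illustration. The idea is to ask the KNN search for more hits than you
need, then keep the first N that pass your other constraint.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

public class OversampleAndFilter {

    /** Squared Euclidean distance between two vectors of equal length. */
    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Stand-in for the KNN search (assumption: exact brute force, not HNSW).
     * Returns the ids of the k vectors closest to the query, nearest first.
     */
    static List<Integer> nearest(float[][] vectors, float[] query, int k) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < vectors.length; i++) {
            ids.add(i);
        }
        ids.sort(Comparator.comparingDouble(id -> squaredDistance(vectors[id], query)));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    /**
     * Oversample, then filter: fetch n * oversample candidates and keep the
     * first n that satisfy the constraint. May return fewer than n hits.
     */
    static List<Integer> filteredNearest(float[][] vectors, float[] query,
                                         int n, int oversample, IntPredicate accept) {
        List<Integer> candidates = nearest(vectors, query, n * oversample);
        List<Integer> hits = new ArrayList<>();
        for (int id : candidates) {
            if (hits.size() == n) {
                break;
            }
            if (accept.test(id)) {
                hits.add(id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        float[][] vectors = {{0f, 0f}, {1f, 0f}, {2f, 0f}, {3f, 0f}, {4f, 0f}, {5f, 0f}};
        float[] query = {0f, 0f};
        // Hypothetical constraint: only documents with an even id are allowed.
        List<Integer> hits = filteredNearest(vectors, query, 2, 3, id -> id % 2 == 0);
        System.out.println(hits); // prints [0, 2]
    }
}
```

The weakness of this approach is visible in the sketch: if the constraint
is very selective, even a large oversampling factor can leave you with
fewer than N hits, which is why tighter integration with the inverted
index would be an improvement.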

On Wed, May 19, 2021 at 12:22 PM Michael Wechner
<michael.wech...@wyona.com> wrote:
>
> Hi Alex
>
> Just to make sure I understand better what the additions are about
>
> On 21.04.21 at 17:21, Alex K wrote:
> > There were a couple additions recently merged into lucene but not yet
> > released:
> > - A first-class vector codec
>
> do you mean the classes inside
>
> https://github.com/apache/lucene/tree/main/lucene/core/src/java/org/apache/lucene/codecs/lucene90
>
> and in particular
>
> Lucene90HnswVectorFormat.java  Lucene90HnswVectorReader.java
> Lucene90HnswVectorWriter.java
>
> ?
>
> > - An implementation of HNSW for approximate nearest neighbor search
>
> the HNSW implementation at
>
> https://github.com/apache/lucene/tree/main/lucene/core/src/java/org/apache/lucene/util/hnsw
>
> is similar to
>
> https://opendistro.github.io/for-elasticsearch/blog/odfe-updates/2020/04/Building-k-Nearest-Neighbor-(k-NN)-Similarity-Search-Engine-with-Elasticsearch/
>
> ?
> >
> > They are however available in the snapshot releases. I started on a small
> > project to get the HNSW implementation into the ann-benchmarks project, but
> > had to set it aside.
>
> Is there still something missing? Or what would be the next steps?
>
> Thanks
>
> Michael
>
>
> >   Here's the code:
> > https://github.com/alexklibisz/ann-benchmarks-lucene. There are some test
> > suites that index and search Glove vectors. My first impression was that
> > indexing seems surprisingly slow, but it's entirely possible I'm doing
> > something wrong.
> >
> > On Wed, Apr 21, 2021 at 9:31 AM Michael Wechner <michael.wech...@wyona.com>
> > wrote:
> >
> >> Hi
> >>
> >> I recently found the following articles re Lucene/Solr and BERT
> >>
> >> https://dmitry-kan.medium.com/neural-search-with-bert-and-solr-ea5ead060b28
> >>
> >> https://medium.com/swlh/fun-with-apache-lucene-and-bert-embeddings-c2c496baa559
> >>
> >> and would like to ask whether there might be more recent developments
> >> within the Lucene/Solr community re BERT integration?
> >>
> >> Also how these developments relate to
> >>
> >> https://sbert.net/
> >>
> >> ?
> >>
> >> Thanks very much for your insights!
> >>
> >> Michael
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >>
> >>
>
>
>

