Hi Michael

Thank you for your explanations!

I am currently trying to implement it, and I am learning from the code at

https://github.com/jtibshirani/lucene/blob/hnsw-bench/lucene/core/src/java/org/apache/lucene/search/PythonEntryPoint.java

although Julie told me that the code is a bit out of date and should be updated very soon.

It would be great to have some example code, similar to what is already available for other parts of Lucene:

https://lucene.apache.org/core/8_8_2/core/overview-summary.html#overview.description

Does this already exist? If not, I could try to create some and contribute it to the documentation.

Thanks

Michael


On 24.05.21 at 05:22, Michael Sokolov wrote:
Hi Michael, that is fully functional in the sense that Lucene will
build an HNSW graph for a vector-valued field and you can then use the
VectorReader.search method to do KNN-based search. Next steps may
include some integration with lexical, inverted-index type search so
that you can retrieve the N closest documents subject to other constraints.
Today you can approximate that by oversampling and filtering. There is
also interest in pursuing other KNN search algorithms, and we have
been working to make sure the VectorFormat API (might still get
renamed due to confusion with other kinds of vectors existing in
Lucene) can support alternative KNN implementations.
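
The "oversampling and filtering" workaround mentioned above can be sketched
roughly as follows. This is only an illustration in plain Java, not Lucene
code: the class OversampleAndFilter, its methods, and the oversample factor
are hypothetical names made up for this sketch, and the brute-force nearest()
method merely stands in for whatever KNN search is actually used (for example
the HNSW-based one). The idea is to ask the KNN search for more than k
candidates and then keep the first k that satisfy the additional constraint.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration of "oversample, then filter"; not part of Lucene.
public class OversampleAndFilter {

    // Squared Euclidean distance between two vectors of equal length.
    public static double squaredDistance(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Brute-force stand-in for a KNN search: returns the ids (array indices)
    // of the n vectors closest to the query, nearest first.
    public static List<Integer> nearest(float[][] vectors, float[] query, int n) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < vectors.length; i++) {
            ids.add(i);
        }
        ids.sort(Comparator.comparingDouble(i -> squaredDistance(vectors[i], query)));
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }

    // Fetch k * oversample candidates, then keep the first k that pass the
    // constraint. The result can contain fewer than k ids if too many
    // candidates are rejected, which is why this only approximates a
    // truly constrained search.
    public static List<Integer> filteredNearest(
            float[][] vectors, float[] query, int k, int oversample, IntPredicate accept) {
        List<Integer> candidates = nearest(vectors, query, k * oversample);
        List<Integer> hits = new ArrayList<>();
        for (int id : candidates) {
            if (accept.test(id)) {
                hits.add(id);
                if (hits.size() == k) {
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        float[][] vectors = {{0f, 0f}, {1f, 0f}, {2f, 0f}, {3f, 0f}, {4f, 0f}, {5f, 0f}};
        float[] query = {0f, 0f};
        // Made-up constraint for the example: only even ids are allowed.
        List<Integer> hits = filteredNearest(vectors, query, 2, 3, id -> id % 2 == 0);
        System.out.println(hits); // prints [0, 2]
    }
}
```

With an oversample factor of 1 the same call returns only [0], because one of
the two candidates is filtered out. Choosing the factor is a trade-off between
search cost and the risk of ending up with fewer than k results.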

On Wed, May 19, 2021 at 12:22 PM Michael Wechner
<michael.wech...@wyona.com> wrote:
Hi Alex

Just to make sure I understand better what the additions are about

On 21.04.21 at 17:21, Alex K wrote:
There were a couple additions recently merged into lucene but not yet
released:
- A first-class vector codec
Do you mean the classes inside

https://github.com/apache/lucene/tree/main/lucene/core/src/java/org/apache/lucene/codecs/lucene90

and in particular

Lucene90HnswVectorFormat.java  Lucene90HnswVectorReader.java
Lucene90HnswVectorWriter.java

?

- An implementation of HNSW for approximate nearest neighbor search
Is the HNSW implementation at

https://github.com/apache/lucene/tree/main/lucene/core/src/java/org/apache/lucene/util/hnsw

similar to

https://opendistro.github.io/for-elasticsearch/blog/odfe-updates/2020/04/Building-k-Nearest-Neighbor-(k-NN)-Similarity-Search-Engine-with-Elasticsearch/

?
They are however available in the snapshot releases. I started on a small
project to get the HNSW implementation into the ann-benchmarks project, but
had to set it aside.
Is there still something missing? Or what would be the next steps?

Thanks

Michael


Here's the code:
https://github.com/alexklibisz/ann-benchmarks-lucene. There are some test
suites that index and search Glove vectors. My first impression was that
indexing seems surprisingly slow, but it's entirely possible I'm doing
something wrong.

On Wed, Apr 21, 2021 at 9:31 AM Michael Wechner <michael.wech...@wyona.com>
wrote:

Hi

I recently found the following articles about Lucene/Solr and BERT

https://dmitry-kan.medium.com/neural-search-with-bert-and-solr-ea5ead060b28

https://medium.com/swlh/fun-with-apache-lucene-and-bert-embeddings-c2c496baa559

and would like to ask whether there might be more recent developments
within the Lucene/Solr community re BERT integration?

Also, how do these developments relate to

https://sbert.net/

?

Thanks very much for your insights!

Michael

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


