Hi there,

Thank you for the responses. Yes, we have a few scenarios in mind that can 
benefit from a vector-based index optimized for ANN searches:


  *   Advanced, optimized, high-precision visual search: For this to work, 
we would convert the images to their vector representations and then search 
them with ANN implementations such as 
SPTAG<https://github.com/Microsoft/SPTAG>, 
FAISS<https://github.com/facebookresearch/faiss>, and 
HNSWLIB<https://github.com/nmslib/hnswlib>.
  *   Advanced document retrieval: Using a numerical vector representation of a 
document, we could improve the search results.
  *   Nearest neighbor queries: Discovering the nearest neighbors to a given 
query could also benefit from these ANN algorithms (although this doesn’t 
necessarily need a vector-based index).
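
To make the nearest neighbor scenario concrete, here is a minimal sketch of the exact (brute-force) search that ANN libraries approximate. It is only an illustration: the function name, the tiny 3-dimensional "embeddings", and the choice of Euclidean distance are assumptions made for this example, not part of Lucene or of any of the libraries listed above.

```python
import heapq
import math


def nearest_neighbors(query, vectors, k):
    """Return the k closest vectors as (distance, index) pairs, nearest first.

    Exact search: every stored vector is compared with the query, so the
    cost grows linearly with the number of vectors.
    """
    scored = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)


# Hypothetical embeddings; real image vectors would have hundreds of dimensions.
embeddings = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 2.0, 0.0),
    (3.0, 3.0, 3.0),
]

result = nearest_neighbors((0.9, 0.1, 0.0), embeddings, k=2)
for distance, index in result:
    print(index, round(distance, 3))
```

An ANN index trades some of this exactness for speed by avoiding the full scan, which is why a dedicated index structure matters once the collection is large.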

I would be grateful to hear your thoughts and whether the community is open to 
a conversation on this topic with my team.

Thanks,

Pedram

From: J. Delgado <joaquin.delg...@gmail.com>
Sent: Thursday, February 28, 2019 7:38 AM
To: dev@lucene.apache.org
Cc: Radhakrishnan Srikanth (SRIKANTH) <rsri...@microsoft.com>
Subject: Re: Vector based store and ANN

Lucene’s scoring function (which I believe is Okapi BM25, 
https://en.m.wikipedia.org/wiki/Okapi_BM25) is a kind of nearest neighbor 
search using the TF-IDF vector representation of documents and query. Are you 
interested in applying ANN to a different kind of vector representation, for 
example Doc2Vec?
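
As a rough illustration of the "scoring as nearest neighbor" view, the sketch below ranks documents by cosine similarity between sparse TF-IDF vectors. It is a simplification under stated assumptions: it is not BM25 and not Lucene's implementation, the IDF formula (log(N/df) + 1) is just one common variant, and the helper names and toy documents are made up for the example.

```python
import math
from collections import Counter


def build_idf(docs):
    """Compute a simple IDF weight per term: log(N / df) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unknown to the corpus are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy, pre-tokenized documents (hypothetical).
docs = [
    ["vector", "search", "index"],
    ["text", "search", "ranking"],
    ["image", "embedding"],
]
idf = build_idf(docs)
doc_vectors = [tfidf(doc, idf) for doc in docs]
query_vector = tfidf(["vector", "index"], idf)

scores = [cosine(query_vector, dv) for dv in doc_vectors]
ranked = sorted(range(len(docs)), key=lambda i: -scores[i])
print(ranked[0], scores)
```

The difference from the dense case is that these vectors are sparse and keyed by term, which is what an inverted index exploits; embeddings such as Doc2Vec are dense, so they need a different index structure.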

On Thu, Feb 28, 2019 at 5:59 AM Adrien Grand <jpou...@gmail.com> wrote:
Hi Pedram,

We don't have much in this area, but I'm hearing increasing interest
so it'd be nice to get better there! The closest that we have is this
class that can search for nearest neighbors for a vector of up to 8
dimensions: 
https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/document/FloatPointNearestNeighbor.java.

On Wed, Feb 27, 2019 at 1:44 AM Pedram Rezaei
<pedr...@microsoft.com.invalid> wrote:
>
> Hi there,
>
>
>
> Is there a way to store numerical vectors (a vector-based index) and perform 
> searches based on the Approximate Nearest Neighbor class of algorithms in 
> Lucene?
>
>
>
> If not, has there been any interest in the topic so far?
>
>
>
> Thanks,
>
>
>
> Pedram



--
Adrien

