Hi Stan,

I played around with LIRE a couple of years ago. I don't know exactly how it works, but from what I remember it doesn't just use Lucene: it has its own classes built around Lucene to perform the image search. There used to be a PDF of a paper on the site, but I couldn't find a link when I looked just now. Here's a quote from its search section:
"For search, classes implementing the ImageSearcher interface are used. The ImageSearcher either takes the given query feature or extracts the feature from a query image. It then reads documents from the index sequentially and compares them to the query image (linear search). Although the main indexing features of Lucene (e.g. an inverted list or stemming) are not employed in this kind of search, LIRe takes advantage of the efficient and fast disk access layer of Lucene, which results in lower search times compared to implementations using the embedded databases HSQLDB2, which is used in Open Office, and Apache Derby3, which is also included in the Java runtime releases as Java DB. Also the use of Lucene allows indexes bigger than common RAM restrictions (e.g. smaller than 2 GB on 32 bit Java) and additional indexing of textual metadata for the images." So it sounds like they're just using Lucene as a fast document store and then implementing their own matching if I understand that blurb correctly. Here's the github page of the project if you want to dig around in the code and see what they're actually doing. https://github.com/dermotte/LIRE Jim ________________________________________ From: Estanislao Oubel <estanislao.ou...@gmail.com> Sent: 06 August 2015 10:13 To: java-user@lucene.apache.org Subject: Re: How to index & search arrays of double? Thanks Phaneendra for responding, I know LIRE, I have been playing around with this library but I don't understand which is the added value. To be more specific, LIRE allows computing several image features and similarity between them, No problem so far. My main concern is that the index used by LIRE is a lucene index (at list in the examples). However, lucene index is an inverted index that seems suitable for indexing terms but it's not clear to me how arrays of values (LIRE features for example) are managed. 
What is even stranger is that, when searching for a specific feature, it is compared to all documents in the index, so I don't see the advantage of using a Lucene index. Perhaps I am missing something, but my understanding is that an index should optimize the search for documents, which seems not to be the case. If you have some experience with LIRE, could you please help me understand all this?

The million-dollar question is: do I necessarily have to use LIRE to solve my specific problem?

If you think that this topic is not suitable for the Lucene forum, please tell me and we can continue the discussion outside the mailing list. But I think it is of general interest, because perhaps there are solutions using native Lucene functions.

Thanks!

Stan

2015-08-06 10:48 GMT+02:00 Phaneendra N <phaneendran.gi...@gmail.com>:

> Hello Stan,
> Great question. I came across one such implementation based on
> Lucene. It's called LIRE.
> This is an open source project. http://www.lire-project.net/
> You might get some ideas there.
> Please let me know if you find answers to your specific questions there.
> I'm curious.
>
> Thanks
> Phaneendra
>
> On Thu, Aug 6, 2015 at 12:39 PM, Estanislao Oubel <
> estanislao.ou...@gmail.com> wrote:
>
> > Hello everybody,
> >
> > I'm currently investigating methods for content-based image retrieval. In
> > this context, I would like to index documents containing arrays of doubles
> > and then perform an approximate search based on these arrays. For example,
> > I would like to insert in the index three documents (d1,d2,d3) containing a
> > field called feature1, a vector of doubles of dimension 3:
> >
> > d1_feature1 = [0.5 1.8 2.4].
> > d2_feature1 = [30.1 0 9.1].
> > d3_feature1 = [0.6 5.8 2.0].
> >
> > Now, I would like Lucene to give me d1 when I search for a document
> > containing [0.51 1.79 2.41] (because d1 is the closest one according to an
> > L1 distance, for example).
> >
> > Is it possible to do this type of thing with Lucene? More specifically:
> > 1. Does Lucene support arrays of doubles as a field type?
> > 2. Is it possible to search documents based on custom distances between
> > these arrays?
> >
> > If so, can you provide some clues about how to implement it? (field types
> > and classes to use, or an example)
> >
> > Thanks!
> >
> > Stan
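
For illustration only, here is a minimal sketch of the kind of linear search the quoted paper excerpt describes, applied to the d1/d2/d3 example from the original question. It is plain Java and makes no Lucene or LIRE calls. The class name LinearL1Search, the methods l1Distance and nearest, and the in-memory map standing in for the index are all hypothetical names chosen for this sketch, not LIRE's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LinearL1Search {

    /** L1 (Manhattan) distance between two vectors of equal length. */
    public static double l1Distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /**
     * Scans every stored vector (linear search) and returns the id of the
     * closest one under L1, or null if the store is empty.
     * The map is a hypothetical stand-in for documents read from an index.
     */
    public static String nearest(Map<String, double[]> store, double[] query) {
        String bestId = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : store.entrySet()) {
            double d = l1Distance(entry.getValue(), query);
            if (d < bestDistance) {
                bestDistance = d;
                bestId = entry.getKey();
            }
        }
        return bestId;
    }

    public static void main(String[] args) {
        // The three example documents from the original question.
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("d1", new double[] {0.5, 1.8, 2.4});
        store.put("d2", new double[] {30.1, 0.0, 9.1});
        store.put("d3", new double[] {0.6, 5.8, 2.0});

        double[] query = {0.51, 1.79, 2.41};
        System.out.println(nearest(store, query)); // prints d1
    }
}
```

Every query touches every stored vector, so the cost grows linearly with the number of documents. In a Lucene-backed setup the map would presumably be replaced by reading a stored field back from each document, as the excerpt suggests LIRE does; that part is an assumption and is not shown here.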