Hi Klaus,
Aren't you clustering and quantizing the descriptor vectors to build the
visual bag of words?
If you are, I don't think you need to worry about the overhead of storing the
vectors in Lucene, because the number of clusters is an upper bound on the
number of distinct words.
I used this technique in Apache alike, which is part of Apache Labs [1].
Apache alike uses Mahout to cluster the visual descriptors and Lucene to
search for similar pictures. The architecture is described in [2].
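To make the quantization step concrete, here is a minimal Python sketch. It is
not the code from Apache alike, only an illustration. It assumes the centroids
were already produced by some clustering step (k-means, for example); the
function names, the "F" prefix, and the toy 2-D data are made up.

```python
import math


def nearest_centroid(descriptor, centroids):
    """Return the index of the centroid closest to `descriptor` (Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for i, centroid in enumerate(centroids):
        d = math.dist(descriptor, centroid)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


def to_visual_words(descriptors, centroids, prefix="F"):
    """Quantize every descriptor to its nearest centroid and return the
    distinct cluster ids as a space-separated token string."""
    ids = sorted({nearest_centroid(d, centroids) for d in descriptors})
    return " ".join(f"{prefix}{i}" for i in ids)


# Hypothetical codebook with three clusters (assumed to come from a clustering run).
centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Hypothetical local descriptors extracted from one image.
descriptors = [(0.1, -0.2), (4.8, 5.1), (0.2, 0.1)]

doc_text = to_visual_words(descriptors, centroids)
print(doc_text)
```

The resulting string can be indexed as ordinary whitespace-separated text, and
the vocabulary can never grow beyond len(centroids) terms.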
Koji
[1] http://labs.apache.org/labs.html
[2] http://svn.apache.org/repos/asf/labs/alike/trunk/alike-architecture.pptx
On 2017/12/13 18:28, Klaus Schaefers wrote:
Hi,
I would like to build an extension to use Lucene for image retrieval. I would represent each image as
a binary vector (visual bag of words). For now I can construct a string like "F1 F2 F10..." to
insert my bit vector into Lucene. Of course this adds quite some overhead, so I was wondering if I
can write directly into the underlying storage engine?
Cheers,
Klaus
--
“Overfitting” is not about an excessive amount of physical exercise...