Hi,

It's perfectly fine to store binary blobs in Lucene. This does not affect 
query performance, because stored fields are only read when you load a hit. 
By default the stored data is also compressed using LZ4.

Just one thing: why the hell UUEncode? You can store binary blobs as is. Just 
pass a byte[] as the stored field value; StoredField has a constructor that 
takes a byte array. When you load the document from the IndexReader, you get 
the raw bytes back, too. That's the most efficient way to encode it.
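
A minimal sketch of the round trip, assuming a Lucene 7.5 to 9.x classpath. 
The field names "hash" and "thumbnail" and the class ThumbnailRoundTrip are 
made up for illustration. On newer Lucene versions searcher.doc(int) is 
deprecated or gone, and searcher.storedFields().document(int) is the 
replacement.

```java
import java.io.IOException;
import java.util.Arrays;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper: indexes one document with a raw binary thumbnail,
// looks it up by its hash, and returns the stored bytes.
public class ThumbnailRoundTrip {

    static byte[] roundTrip(String hash, byte[] thumbnail) throws IOException {
        // In-memory directory, only to keep the sketch self-contained.
        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter writer =
                     new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                Document doc = new Document();
                // Indexed as a single token so the document can be found by hash.
                doc.add(new StringField("hash", hash, Field.Store.YES));
                // Stored only, not indexed: the raw bytes, no text encoding.
                doc.add(new StoredField("thumbnail", thumbnail));
                writer.addDocument(doc);
            }

            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs hits = searcher.search(new TermQuery(new Term("hash", hash)), 1);
                Document found = searcher.doc(hits.scoreDocs[0].doc);
                // Binary stored fields come back as a BytesRef, which is a
                // slice (bytes, offset, length) rather than a plain array.
                BytesRef ref = found.getBinaryValue("thumbnail");
                return Arrays.copyOfRange(ref.bytes, ref.offset, ref.offset + ref.length);
            }
        }
    }
}
```

Copy the slice out of the BytesRef as shown instead of using ref.bytes 
directly, because the backing array may be larger than the value.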

No need for a side database.

Uwe

On December 2, 2018 9:20:13 AM UTC, Joe MA <mrj...@comcast.net> wrote:
>Greetings,
>
>I have an index where I import documents such as PowerPoint, PDF, and
>so forth.  One nice feature I added is that for each document, I store
>a thumbnail of the first page as an encoded String (uuencode) using a
>stored, not-indexed field.  This thumbnail gets displayed when the user
>finds a document.
>
>I am wondering how efficient this is as the index grows to perhaps
>hundreds of thousands, if not millions, of documents.  Is it a good
>idea?
>These encoded strings could be several hundred bytes in size, and of
>course are completely unique for each file indexed, and provide no
>'search' value.  On the surface, it seems like there could be a better
>way to do this given the size, as well as the extra retrieval time for
>Lucene to pull these fields for found documents.
>
>Since I also have a unique hash for each document in the index, it
>would not be too difficult to set up a separate, independent NoSQL
>key/value store with the thumbnail images, such as MongoDB or similar,
>and then retrieve the thumbnails from that store instead of keeping
>them in the Lucene index.  Does this seem like a better approach? Or is
>Lucene stored field retrieval efficient enough that there would be no
>benefit to doing this?  Any other ideas?
>
>Thanks in advance,
>J

--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de
