On 10/11/2013 03:04 PM, Adrien Grand wrote:
On Fri, Oct 11, 2013 at 7:03 PM, Michael Sokolov
<msoko...@safaribooksonline.com> wrote:
I've been running some tests comparing storing large fields (documents, say
100K .. 10M) as files vs. storing them in Lucene as stored fields. Initial
results indicate that storing them externally is a win (at least for binary
docs, which don't compress well; presumably we could compress the external
files too if we wanted), which makes sense. There will be some issues with
huge directories, but that might be worth solving.

So I'm wondering if there is a codec that does that?  I haven't seen one
talked about anywhere.
I don't know of any codec that works this way, but such a codec
would quickly exhaust the available file descriptors.

I'm not sure I understand. I was thinking the stored fields would be accessed infrequently (only when writing or reading a particular stored field value), and that the file descriptor would be in use only for the duration of that read or write - it wouldn't be held open afterwards. During query scoring, for example, these fields wouldn't need to be visited at all, I think? But I may have a fundamental misunderstanding of how Lucene uses its codecs; this is new to me.
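
To make the access pattern I have in mind concrete, here's a rough sketch - not a codec, just the "store a pointer in the index, keep the bytes outside" idea. The paths and class name are made up, and it's written against the newer Path-based FSDirectory/IndexWriterConfig signatures, so adjust for your Lucene version:

    import java.nio.file.*;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.*;
    import org.apache.lucene.index.*;
    import org.apache.lucene.search.*;
    import org.apache.lucene.store.FSDirectory;

    public class ExternalBlobSketch {
        public static void main(String[] args) throws Exception {
            Path indexPath = Paths.get("/tmp/idx");    // made-up locations
            Path blobDir = Paths.get("/tmp/blobs");
            Files.createDirectories(blobDir);

            // Index time: write the big payload to its own file and store only
            // the (tiny) path in the index.  The descriptor on the external
            // file is closed as soon as Files.write returns.
            try (FSDirectory dir = FSDirectory.open(indexPath);
                 IndexWriter writer = new IndexWriter(dir,
                         new IndexWriterConfig(new StandardAnalyzer()))) {
                byte[] payload = Files.readAllBytes(Paths.get("/tmp/big-input.pdf"));
                Path blob = blobDir.resolve("doc-1.bin");
                Files.write(blob, payload);

                Document doc = new Document();
                doc.add(new StringField("id", "doc-1", Field.Store.YES));
                doc.add(new StoredField("blobPath", blob.toString()));
                writer.addDocument(doc);
            }

            // Search time: scoring never touches the blob.  The external file
            // is opened only when we fetch this particular hit's stored fields
            // and decide to follow the pointer.
            try (FSDirectory dir = FSDirectory.open(indexPath);
                 IndexReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs hits = searcher.search(new TermQuery(new Term("id", "doc-1")), 1);
                Document hit = searcher.doc(hits.scoreDocs[0].doc);
                byte[] payload = Files.readAllBytes(Paths.get(hit.get("blobPath")));
                System.out.println("fetched " + payload.length + " bytes");
            }
        }
    }

The point is that the only time a descriptor on the external file is open is inside those Files.write / Files.readAllBytes calls, so I wouldn't expect descriptors to pile up - unless a codec-level implementation would have to keep them open, which is the part I may be missing.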

-Mike
