Re: Storing Documents in Lucene

Adrien Grand Thu, 28 Mar 2013 18:25:22 -0700

On Thu, Mar 28, 2013 at 11:06 PM, Paul <arach...@gmail.com> wrote:
> Hi,


Hi Paul,

> Some of the stuff I've read suggests that Lucene is not especially 
> well-suited to storing the documents. It's supposed to be great at indexing 
> those documents, but not so great at storing the docs themselves.
>
> Can someone shed some light on this?

I'd say it is the same problem as with other databases: large stored
fields might put pressure on your operating system's I/O cache and
make search slower. However, if your fields are small (i.e. not
high-resolution photos), I think it is reasonable to store them in the
Lucene index, especially now that Lucene compresses stored fields.

> If this is true, then am I right to think that the typical Lucene use case
> is to
>   a. Index a document
>   b. Store in the index some kind of unique document identifier that is
>      meaningful to the "native" application
>   c. Search the index, obtain this ID, and present it to the native app to
>      fetch the original document?

If you need to store your documents somewhere else anyway, this
approach is good. But you could use Lucene as your primary store as
well.
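
To make steps a to c concrete, here is a minimal sketch in plain Java. It
deliberately does not use Lucene's API: a toy in-memory inverted index
stands in for Lucene, and a HashMap stands in for the native application's
store. The class name IdOnlyIndex, the document IDs, and the sample texts
are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for the "index here, store elsewhere" pattern; not Lucene's API. */
public class IdOnlyIndex {
    // Inverted index: term -> IDs of the documents containing that term.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Steps a and b: index the text, but keep only the application's ID. */
    public void index(String id, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new LinkedHashSet<>()).add(id);
            }
        }
    }

    /** Step c, first half: a single-term search returns the matching IDs. */
    public List<String> search(String term) {
        Set<String> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        // Hypothetical primary store (a database, a file system, ...).
        Map<String, String> primaryStore = new HashMap<>();
        primaryStore.put("doc-1", "Lucene compresses stored fields");
        primaryStore.put("doc-2", "MongoDB builds indices on insert");

        IdOnlyIndex index = new IdOnlyIndex();
        for (Map.Entry<String, String> e : primaryStore.entrySet()) {
            index.index(e.getKey(), e.getValue());
        }

        // Step c, second half: hand each ID back to the primary store.
        for (String id : index.search("lucene")) {
            System.out.println(id + " -> " + primaryStore.get(id));
        }
    }
}
```

The index never holds the original text, only terms and IDs, so every hit
costs one extra lookup in the primary store. Storing the fields in the
index itself would avoid that second lookup.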

> This came up in the context of trying to compare MongoDB and Lucene. But as I 
> dug into it I began to think that this might be an apples to oranges 
> comparison. MongoDB builds indices as you insert documents, but it seems like 
> Lucene is more about the indexing and less about storing documents.

Since Lucene is only a library, you might want to check out Solr or
ElasticSearch, which are more comparable to MongoDB than Lucene is.

I hope this helps!

-- 
Adrien

