On Thu, Mar 28, 2013 at 11:06 PM, Paul <arach...@gmail.com> wrote:
> Hi,

Hi Paul,

> Some of the stuff I've read suggests that Lucene is not especially
> well-suited to storing the documents. It's supposed to be great at indexing
> those documents, but not so great at storing the docs themselves.
>
> Can someone shed some light on this?

I'd say it is the same problem as with other databases: large stored
fields can thrash your operating system's I/O cache and make searches
slower. However, if your fields are small (i.e. not high-resolution photos),
I think it is reasonable to store them in the Lucene index, especially now
that Lucene compresses stored fields.

> If this is true, then am I right to think that the typical Lucene use case
> is to
>   a. Index a document
>   b. Store in the index some kind of unique document identifier that is
>      meaningful to the "native" application
>   c. Search the index, obtain this ID, and present it to the native app to
>      fetch the original document?

If you need to store your documents somewhere else anyway, this approach is
good. But you could use Lucene as your primary store as well.

> This came up in the context of trying to compare MongoDB and Lucene. But as
> I dug into it I began to think that this might be an apples to oranges
> comparison. MongoDB builds indices as you insert documents, but it seems
> like Lucene is more about the indexing and less about storing documents.

Since Lucene is only a library, you might want to check out Solr or
ElasticSearch, which are more comparable to MongoDB than Lucene is.

I hope this helps!

--
Adrien
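
PS: In case it helps, here is a minimal sketch of the a/b/c workflow quoted above, in plain Java with no Lucene dependency. It is only meant to show the shape of the pattern: the index keeps terms and IDs, and the application resolves IDs to the original documents. The class and method names (IdOnlyIndexSketch, add, searchIds, fetch) are made up for illustration and are not Lucene API, and two in-memory maps stand in for the search index and for the "native" store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names are Lucene, Solr or MongoDB API.
final class IdOnlyIndexSketch {
    // Stand-in for the "native" application's store (database, filesystem, ...).
    private final Map<String, String> nativeStore = new HashMap<>();
    // Stand-in for the search index: term -> IDs of the documents containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Steps (a) and (b): index the text, but keep only the ID in the index.
    void add(String id, String text) {
        nativeStore.put(id, text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
        }
    }

    // Step (c), first half: a single-term search returns matching IDs only.
    List<String> searchIds(String term) {
        Set<String> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    // Step (c), second half: the application resolves an ID to the original.
    String fetch(String id) {
        return nativeStore.get(id);
    }

    // Lowercase the text and split it on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        IdOnlyIndexSketch index = new IdOnlyIndexSketch();
        index.add("doc-1", "Lucene compresses stored fields");
        index.add("doc-2", "MongoDB builds indices on insert");
        for (String id : index.searchIds("lucene")) {
            System.out.println(id + " -> " + index.fetch(id));
        }
    }
}
```

Running main prints the one match, doc-1, followed by its original text. With real Lucene you would get the same split by storing only the ID field and indexing the body without storing it; storing the body as well is what turns the index into your primary store.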