Simon,

Storing content in a Lucene index is a common approach and works well.
I use a patch, LUCENE-362, to boost performance: compress and
decompress the field externally and store just the byte[] in the Lucene
index.  The patch eliminates all the copying of the byte[] that Lucene
otherwise does, at the cost of supporting only one such field per Document.
The patch is a bit old, so you may need to "help" it apply to the latest
source if the patch tool doesn't do it for you.
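
A minimal sketch of the external compress/decompress step, assuming
java.util.zip is acceptable for the job.  FieldCompression and its two
methods are hypothetical names for illustration only; they are not part of
Lucene or of the LUCENE-362 patch, and how the resulting byte[] gets into a
stored field depends on your Lucene version:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper (assumption): not part of Lucene or LUCENE-362.
final class FieldCompression {

    private FieldCompression() {}

    // Deflate the raw entry bytes before they are handed to the index.
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out =
                new ByteArrayOutputStream(Math.max(32, raw.length / 2));
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    // Inflate bytes read back from the stored field.
    static byte[] decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(packed);
            ByteArrayOutputStream out =
                new ByteArrayOutputStream(Math.max(32, packed.length * 2));
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read: input is truncated
                // or needs a preset dictionary we do not have.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported data");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }
}
```

You would call FieldCompression.compress(...) on the serialized entry before
adding it to the Document, and FieldCompression.decompress(...) on the bytes
you read back.  BEST_SPEED is only a guess at a reasonable trade-off; measure
with your own feed entries.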

Chuck

Simon Willnauer wrote on 05/27/2006 01:33 PM:
> For those who haven't heard about the GData project, please check
> today's mailing list.
> The Lucene indexer is supposed to be used as the search component of
> this implementation. GData is an extension to the Atom/RSS format
> that adds search and a kind of versioning, and this project is a
> server-side implementation of the protocol. So here is the problem: the
> incoming feed entries and their updates have to be stored somewhere in
> persistent storage. The easiest approach would be flat-file
> storage, which is not sufficient in my eyes. I thought about using an
> approach similar to the Nutch distributed file system: index the
> incoming entries in a "searchable" index and store the whole entry in
> an associated index to prevent the searchable index from growing too fast.
> To keep the index small I would create a separate index for each feed
> instance, organized in the local file system.
> I would be interested to hear whether anybody has experience with
> retrieving large data, like whole feed entries, from a "storage" Lucene
> index. Should I expect any performance problems with this approach?
> As far as I know, Lucene doesn't support any versioning, or did that
> change by any chance? Well, the protocol description doesn't say
> anything about retrieving old versions (the documentation only covers
> optimistic locking / updating versions).
>
> regards Simon


