Well, I'm just wondering, in specific terms: if we use an object-based
storage system as an assetstore rather than a filesystem, where do the
files that Lucene indexes actually sit?

It's my understanding that in a filesystem-based assetstore, for
example, text is extracted from PDFs and stored in a separate file
*within the assetstore directory* that Lucene crawls. I just don't know
how that sort of thing is handled when using object-based storage.

On Thu, 2007-05-03 at 13:28 -0400, Richard Rodgers wrote:
> Hi Cory:
> 
> Not sure about the limits of Lucene, but I think the larger point is
> that the back-ends are expected only to hold the real content or assets.
> Everything else (full-text indices and the like) is an *artifact* (it can be
> recreated from the assets) that we don't need to manage in the same way.
> If for performance reasons we want to put them where the assets are we
> can, but there is really no connection between the two that the system
> imposes. 
> 
> Does this get at your question, or did I miss the point?
> 
> Thanks,
> 
> Richard R
> 
> On Thu, 2007-05-03 at 12:13 -0400, Cory Snavely wrote:
> > (Apologies if this has been discussed to resolution; after a few
> > attempts to search the archives, I concluded they are really broken. 500
> > errors, bad links, etc.)
> > 
> > For those using, interested in, or knowledgeable about using API-based
> > storage (SRB, S3) as a backend for DSpace: how does doing so affect
> > full-text indexing? Can anyone describe how, in such a setup, full text
> > is stored and indexed?
> > 
> > My uneducated impression is that Lucene would want to work only against
> > a filesystem.
> > 
> > Thanks,
> > Cory Snavely
> > University of Michigan Library IT Core Services
> > 
> > 
> > 
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by DB2 Express
> > Download DB2 Express C - the FREE version of DB2 express and take
> > control of your XML. No limits. Just data. Click to get it now.
> > http://sourceforge.net/powerbar/db2/
> > _______________________________________________
> > DSpace-tech mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/dspace-tech
> 


