On 5/4/07, Cory Snavely < [EMAIL PROTECTED]> wrote:
Well, I'm just wondering, in specific terms, if we use an object-based
storage system as an assetstore rather than a filesystem, where the
files that Lucene indexes actually sit.

It's tricky. This is what FilterMedia is for: it extracts the text and places it as a bitstream in the assetstore. Lucene full-text indexing is done against the assetstore bitstreams in all cases (well, except for the metadata table in the database). So ultimately you're pushing the text bitstreams into the assetstore (S3) in FilterMedia and pulling them back out on Lucene indexing, a double whammy.
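
For what it's worth, here is a minimal Python sketch of that round trip. Everything in it is an assumption for illustration: `AssetStore`, `InMemoryObjectStore`, `filter_media` and `build_index` are hypothetical names, not DSpace or Lucene APIs, and the toy inverted index only stands in for Lucene. The counters just make the extra put and get against the object store visible.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class AssetStore(ABC):
    """Hypothetical asset store interface: opaque IDs mapped to bytes."""

    @abstractmethod
    def put(self, bitstream_id: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, bitstream_id: str) -> bytes: ...


class InMemoryObjectStore(AssetStore):
    """Stand-in for an object store such as S3; counts each round trip."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self.puts = 0
        self.gets = 0

    def put(self, bitstream_id: str, data: bytes) -> None:
        self.puts += 1
        self._objects[bitstream_id] = data

    def get(self, bitstream_id: str) -> bytes:
        self.gets += 1
        return self._objects[bitstream_id]


def filter_media(store: AssetStore, original_id: str, extract_text) -> str:
    """Extract text from the original and store it as a new bitstream."""
    original = store.get(original_id)
    text_id = original_id + ".txt"  # naming scheme is an assumption
    store.put(text_id, extract_text(original).encode("utf-8"))
    return text_id


def build_index(store: AssetStore, text_ids) -> dict:
    """Pull each text bitstream back out and build a toy inverted index."""
    index = defaultdict(set)
    for text_id in text_ids:
        text = store.get(text_id).decode("utf-8")
        for word in text.lower().split():
            index[word].add(text_id)
    return index


store = InMemoryObjectStore()
store.put("item-1", b"Full text lives in the assetstore")
text_id = filter_media(store, "item-1", lambda raw: raw.decode("utf-8"))
index = build_index(store, [text_id])
```

Nothing here touches a local filesystem: the indexer only needs `get` on the store, which is why the cost shows up as extra transfers rather than as a hard incompatibility.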

Cheers,
Mark


It's my understanding that in a filesystem-based assetstore, for
example, text is extracted from PDFs and stored in a separate file
*within the assetstore directory* that Lucene crawls. I just don't know
how that sort of thing is handled when using object-based storage.

On Thu, 2007-05-03 at 13:28 -0400, Richard Rodgers wrote:
> Hi Cory:
>
> Not sure about the limits of Lucene, but I think the larger point is
> that the back-ends are expected only to hold the real content or assets.
> Everything else (full-text indices and the like) are *artifacts* (can be
> recreated from the assets) that we don't need to manage in the same way.
> If for performance reasons we want to put them where the assets are we
> can, but there is really no connection between the two that the system
> imposes.
>
> Does this get at your question, or did I miss the point?
>
> Thanks,
>
> Richard R
>
> On Thu, 2007-05-03 at 12:13 -0400, Cory Snavely wrote:
> > (Apologies if this has been discussed to resolution; after a few
> > attempts to search the archives, I concluded they are really broken. 500
> > errors, bad links, etc.)
> >
> > For those using, interested in, or knowledgeable about using API-based
> > storage (SRB, S3) as a backend for DSpace: how does doing so affect
> > full-text indexing? Can anyone describe how, in such a setup, full text
> > is stored and indexed?
> >
> > My uneducated impression is that Lucene would want to work only against
> > a filesystem.
> >
> > Thanks,
> > Cory Snavely
> > University of Michigan Library IT Core Services
> >
> >
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by DB2 Express
> > Download DB2 Express C - the FREE version of DB2 express and take
> > control of your XML. No limits. Just data. Click to get it now.
> > http://sourceforge.net/powerbar/db2/
> > _______________________________________________
> > DSpace-tech mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/dspace-tech
>


~~~~~~~~~~~~~
Mark R. Diggory - DSpace Systems Manager
MIT Libraries, Systems and Technology Services
Massachusetts Institute of Technology

