I have been looking at making DSpace use an Amazon S3 bucket when it stores an 
entry, so that the metadata goes into the Oracle database on an Amazon RDS 
instance and the content goes directly to S3.

With s3cmd, which talks to S3 over HTTP, performance is much better than going 
through an operating-system file mount.

When DSpace writes the content outside the database, the command would be 
something like:
s3cmd put filename.pdf s3://bucket_name/subdir1/subdir2/filename.pdf

For DSpace to then retrieve the same object from S3, the command would be:
s3cmd get s3://bucket_name/subdir1/subdir2/filename.pdf filename.pdf
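As a rough sketch of how DSpace might shell out to s3cmd for the put and get 
above: the class and method names here (S3CmdSketch, objectUri, putCommand, 
getCommand, run) are hypothetical and not part of DSpace, and the sketch 
assumes s3cmd is installed, on the PATH, and already configured with 
credentials.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of DSpace: wraps the s3cmd calls shown above.
final class S3CmdSketch {
    private S3CmdSketch() {}

    // Builds the s3:// URI; this string could serve as the label stored in the database.
    static String objectUri(String bucket, String key) {
        return "s3://" + bucket + "/" + key;
    }

    // Argument list for: s3cmd put <localFile> s3://<bucket>/<key>
    static List<String> putCommand(String localFile, String bucket, String key) {
        return Arrays.asList("s3cmd", "put", localFile, objectUri(bucket, key));
    }

    // Argument list for: s3cmd get s3://<bucket>/<key> <localFile>
    static List<String> getCommand(String bucket, String key, String localFile) {
        return Arrays.asList("s3cmd", "get", objectUri(bucket, key), localFile);
    }

    // Runs the command and returns its exit code (assumption: s3cmd is on the PATH).
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }
}
```

Calling S3CmdSketch.run(S3CmdSketch.putCommand("filename.pdf", "bucket_name", 
"subdir1/subdir2/filename.pdf")) would issue the same put as above. Passing the 
arguments as a list rather than one shell string avoids quoting problems with 
unusual file names.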

Note that I made S3 look like a file system, which it is not.  It is flat 
object storage, and a slash is just another character in an object name.  
Naming objects this way makes the bucket look like a file system to the end 
user, so a listing of the objects looks very familiar:
s3cmd ls s3://bucket_name/subdir1/subdir2/
Date      Time      Size        s3://bucket_name/subdir1/subdir2/filename.pdf
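
To illustrate the point that the "directories" are only a naming convention, 
here is a minimal in-memory sketch of a flat key space with prefix listing.  
It is not how S3 is implemented, and FlatKeyspaceSketch and its methods are 
names I made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of a flat object store: keys are plain strings, no directories.
final class FlatKeyspaceSketch {
    private final TreeSet<String> keys = new TreeSet<>();

    // Stores a key; the slashes in it have no special meaning to the store.
    void put(String key) {
        keys.add(key);
    }

    // Returns every key that starts with the prefix, in sorted order.
    List<String> list(String prefix) {
        List<String> matches = new ArrayList<>();
        // Keys sharing a prefix are contiguous in sorted order, starting at the prefix itself.
        for (String key : keys.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break;
            }
            matches.add(key);
        }
        return matches;
    }
}
```

Listing with the prefix "subdir1/subdir2/" returns only the keys that begin 
with that string, which is what makes the bucket feel like a directory tree 
even though no directory objects exist.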

Operating-system file access through s3fs is really slow and not that 
reliable.  The mount tends to fail and has to be remounted from time to time.

I think the DSpace ItemImport.java class could be modified to write the 
external data to S3 this way, while storing the S3 label for the object in the 
database.  Is there a clean way to do it?  What about having DSpace use the 
get, put, and ls commands of s3cmd?

Has having DSpace store content in Amazon S3 been looked at in the past?  
If so, how was it resolved?

Thank you.

Charles Keagle
Sr. Cloud Engineer | 2nd Watch
603 Stewart St, Suite 707 | Seattle, WA | 98101
Mobile 425-417-3434 | Office 888.747.8254
http://www.2ndwatch.com

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
