Hi Joseph,

Going with S3 would actually be a great way to break the "we can't put
it in the repository because we'll run out of disk space" barrier, and
cheaply. Many repo admins will likely want to consult the "trusted
repository" handbook first, and to check their legal rights to move
uploaded files to other storage silos. Concerns like those are part of
why our university has a massive data center (and massive data center
costs).

The s3fs option sounds like the quickest to accomplish. Refactoring
DSpace to have a pluggable asset-storage system, and then implementing
it for S3, would take some effort. Hopefully someone more
knowledgeable can chime in (there may be some prior art).
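
To make the "pluggable asset-storage system" idea a bit more concrete,
here is a rough sketch of the kind of interface I have in mind. The
names AssetStore, LocalDiskAssetStore and AssetStoreDemo are made up
for illustration and are not existing DSpace classes; an S3-backed
class would implement the same interface against Amazon's API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical interface, NOT part of DSpace's actual API.
interface AssetStore {
    void put(String id, InputStream data) throws IOException;
    InputStream get(String id) throws IOException;
    void remove(String id) throws IOException;
}

// Local-disk implementation; it also works unchanged on any mounted
// filesystem (for example an s3fs mount point).
final class LocalDiskAssetStore implements AssetStore {
    private final Path baseDir;

    LocalDiskAssetStore(Path baseDir) throws IOException {
        this.baseDir = Files.createDirectories(baseDir.toAbsolutePath().normalize());
    }

    // Map an id such as "12/34/abc" to a path, refusing ids that escape baseDir.
    private Path resolve(String id) throws IOException {
        Path p = baseDir.resolve(id).normalize();
        if (!p.startsWith(baseDir)) {
            throw new IOException("id escapes the asset store: " + id);
        }
        return p;
    }

    @Override
    public void put(String id, InputStream data) throws IOException {
        Path target = resolve(id);
        Files.createDirectories(target.getParent());
        Files.copy(data, target, StandardCopyOption.REPLACE_EXISTING);
    }

    @Override
    public InputStream get(String id) throws IOException {
        return Files.newInputStream(resolve(id));
    }

    @Override
    public void remove(String id) throws IOException {
        Files.deleteIfExists(resolve(id));
    }
}

final class AssetStoreDemo {
    public static void main(String[] args) throws IOException {
        AssetStore store = new LocalDiskAssetStore(Files.createTempDirectory("assetstore"));
        store.put("12/34/demo", new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)));
        try (InputStream in = store.get("12/34/demo")) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
        store.remove("12/34/demo");
    }
}
```

The rest of the application would only talk to the interface, so
swapping local disk for S3 becomes a configuration choice rather than
a code change.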

The downside of making a network connection for disk access shows up
when you run index-init or filter-media, which then have to make
network requests for what are typically fast disk accesses. It should
work fine, but those tasks will be much slower. This is likely less
of a problem if you're running on Amazon EC2.
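
If you want a rough number for how much slower, one option is to time
a full read of the assetstore directory on local disk and again on
the network mount. A minimal sketch follows; ReadTimer is a made-up
helper, not a DSpace tool, and it only approximates the read pattern
of a task that touches every bitstream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for comparing storage backends.
final class ReadTimer {

    // Reads every regular file under root once.
    // Returns {fileCount, totalBytes, elapsedNanos}.
    static long[] timeFullRead(Path root) throws IOException {
        long start = System.nanoTime();
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        long bytes = 0;
        byte[] buf = new byte[8192];
        for (Path p : files) {
            try (InputStream in = Files.newInputStream(p)) {
                int n;
                while ((n = in.read(buf)) != -1) {
                    bytes += n;
                }
            }
        }
        return new long[] {files.size(), bytes, System.nanoTime() - start};
    }

    public static void main(String[] args) throws IOException {
        // Pass the directory to measure as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        long[] r = timeFullRead(root);
        System.out.printf("%d files, %d bytes, %.1f ms%n", r[0], r[1], r[2] / 1e6);
    }
}
```

Run it once against the local assetstore and once against the mounted
one, ideally more than once each, since OS and s3fs caching can make
the second pass look faster than a cold read.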

All that said, having S3 would be useful for managing multiple
development environments, where rsyncing production's assetstore to an
external drive connected to each computer becomes a chore. Not sure if
rsync to S3 is much better, though.

Also, the demo.dspace.org site resides wholly in Amazon EC2, likely on
an EBS filesystem. So there's nothing wrong with Amazon; it's just
that, whatever solution you use, your virtual CPU and virtual disk
should be as close together as possible. Or, perhaps, as close as
possible to the end user.

@Hardy, I don't think the 64GB max per file is going to slow me down
any. Our entire repo is about that size, and that's thousands of files.

On 3/18/11, Pottinger, Hardy J. <pottinge...@umsystem.edu> wrote:
> Hi, I'm certainly not an expert in this area, but from my quick read,
> depending on the use case for your repository, this looks like something
> that might work. One thing to be aware of is the 64GB max file size imposed
> by s3fs, and the potential for S3's "Eventual Consistency" model to cause
> "problems" with user submissions. More details on the s3fs wiki:
> http://code.google.com/p/s3fs/wiki/EventualConsistency
>
> I'm interested in hearing more about this, if anyone has actually played
> around with putting an assetstore on s3fs.
>
> --Hardy
>
>> -----Original Message-----
>> From: Joseph Rhoads [mailto:jrho...@westga.edu]
>> Sent: Friday, March 18, 2011 1:12 PM
>> To: dspace-devel@lists.sourceforge.net
>> Subject: [Dspace-devel] Using Amazon S3 for an Assetstore
>>
>> I've seen some talk about integrating Amazon S3 as an assetstore (or
>> bitstream store as it's sometimes called).
>>
>>
>>
>> Has anyone tried using something like s3fs, a "FUSE-based file system on
>> Amazon" ?
>>
>> (I know there are several flavors of the same idea around but
>> http://code.google.com/p/s3fs/ seems like a fairly mature one.  Another
>> is http://code.google.com/p/s3ql/ )
>>
>>
>>
>> And just using a directory on the mounted fs as your directory for the
>> assetstore.
>>
>> Are there subtleties that I haven't noticed (after a 10 minute first
>> glance) that would make it apparent that this is a bad idea?
>>
>> Has anyone done this successfully?
>>
>>
>>
>> -Joseph
>
>
> ------------------------------------------------------------------------------
> Colocation vs. Managed Hosting
> A question and answer guide to determining the best fit
> for your organization - today and in the future.
> http://p.sf.net/sfu/internap-sfd2d
> _______________________________________________
> Dspace-devel mailing list
> Dspace-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-devel
>


-- 
Peter Dietz
