Hi Richard:

A quick reaction to your questions (I'll look into it more) is this: in principle it would certainly be doable, but the issue will likely be tolerance for performance tradeoffs. In my prototype I preserved the stream-oriented aspect of the API, which means I don't store a local copy of the asset file before shipping it off to S3. Fortunately S3 returns an MD5 of the contents it receives, so the upload can still be checked. Certain types of compression and/or encryption may need a view of the whole file to do their work: if so, the bitstore would have to use a temporary location, receive the whole file, and then resend it, which would obviously double the transfer time. But some compression/crypto schemes work on the data as it streams past and don't need the whole file, so maybe we could be OK.
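To make the streaming case a bit more concrete, here is a rough Python sketch. It is not taken from the prototype; the function name stream_compressed and the in-memory "sink" standing in for the S3 request body are invented for illustration. It compresses the asset chunk by chunk with zlib and keeps a running MD5 of the compressed bytes as they go out, so nothing is staged in a temporary location and the digest could, in principle, be compared with the MD5 the store reports for what it received.

```python
import hashlib
import io
import zlib


def stream_compressed(source, sink, chunk_size=64 * 1024):
    """Compress `source` into `sink` one chunk at a time.

    Returns the MD5 hex digest of the compressed bytes written to `sink`.
    `source` and `sink` are any binary file-like objects (hypothetical
    stand-ins for the incoming asset stream and the outgoing upload).
    """
    compressor = zlib.compressobj()
    digest = hashlib.md5()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer and emit nothing for a chunk
            digest.update(out)
            sink.write(out)
    tail = compressor.flush()  # emit whatever is still buffered
    digest.update(tail)
    sink.write(tail)
    return digest.hexdigest()


if __name__ == "__main__":
    asset = io.BytesIO(b"example asset bytes " * 10000)
    uploaded = io.BytesIO()  # assumption: stands in for the upload body
    local_md5 = stream_compressed(asset, uploaded)
    # Stand-in for the MD5 the remote store would report back.
    remote_md5 = hashlib.md5(uploaded.getvalue()).hexdigest()
    assert local_md5 == remote_md5
    assert zlib.decompress(uploaded.getvalue()) == asset.getvalue()
```

A scheme that needs to see the whole file before emitting anything could not be written this way; it would need the temporary copy and the second transfer described above.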

Thanks,

Richard

On Thu, 2007-04-19 at 21:23 +1200, Richard MAHONEY wrote:
> Dear Richard,
>
> On Thu, 2007-04-19 at 04:23, Richard Rodgers wrote:
> > Richard:
> >
> > I'm putting up a prototype implementation of (inter alia) an S3 backend
> > on the DSpace wiki. (see 'PluggableStorage' page). Would love volunteers
> > to vet it (not ready for production).
> >
> > Thanks,
> >
> > Richard R.
>
> Without wanting to sound overly effusive, I'd just like to say how
> deeply grateful I am that you are working on the Amazon S3 bitstore.
> This is all very exciting and I hope to experiment with S3BitStore once
> I am finished migrating Indica et Buddhica to Joyent/TextDrive,
> hopefully by the end of the month.** ... Something I'd like to ask before
> then though.
>
> Presently all the material I hold on S3 consists of encrypted
> compressed tar balls (Solaris 10: gtar, bzip2, encrypt). These can be
> created using UNIX pipes, similar to producing encrypted tape backups.
> How hard would it be, then, to use S3BitStore to send encrypted,
> possibly compressed, data to an assetstore on S3? I already send and
> retrieve all material using SSL. It seems to me that the addition of
> data encryption and compression would certainly go some way to
> reassuring an institution wishing to archive sensitive material, cost
> effectively. Would all of this be non-trivial? Any thoughts.
>
>
> Kind regards,
>
> Richard M.
>
>
> ** I think I recall reading a while ago on this list about firms,
> notably TextDrive, being unwilling to host Java apps. It seemed that if
> one wished to run DSpace one needed a dedicated machine. This is no
> longer the case. See Joyent/TextDrive's Accelerators:
>
> http://radiant.joyent.com/accelerator/
>
> >
> > On Thu, 2007-04-12 at 09:49 +1200, Richard MAHONEY wrote:
> > > Dear Robert et al.,
> > >
> > > On Thu, 2007-04-12 at 07:15, Robert Tansley wrote:
> > > > We considered this way back when (2001); we decided on using the
> > > > filesystem because some files might be very very large, there might be
> > > > lots of them and in general it's easier to split filesystem-based
> > > > asset stores across multiple drives/machines than a big relational
> > > > database.
> > > >
> > > > That said, the intention was that storage would be made pluggable --
> > > > so you could have RDBMS, SRB/iRODs, open-source GoogleFileSystem,
> > > > LOCKSS-ish etc. storage. That pluggability ended up being one of the
> > > > many non-critical-for-version-1 features we had to drop to get DSpace
> > > > 1.0 finished :-) There are some projects (e.g. the MIT ones) looking
> > > > at how to really accomplish this.
> > >
> > > Over the past few weeks I've been using Amazon's Simple Storage Service
> > > (S3):
> > >
> > > http://www.amazon.com/gp/browse.html?node=16427261
> > >
> > > At this point I've merely been using it to backup web servers and
> > > development directories. This has involved the simple upload of
> > > compressed tarballs (using the Java app. jSh3ll) but also the
> > > synchronising of file systems (using the Ruby app. s3sync).
> > >
> > > In all, I've been pleasantly surprised by the results. It would seem
> > > that the S3 storage system promises to be more resilient than anything
> > > I could build at a reasonable cost.
> > >
> > > Although I've only been using S3 for remote backup, it seems that it
> > > can also be used as a live file system for storing and retrieving data
> > > for web apps. I am wondering then, if anyone, may be able to suggest
> > > how it might be possible to configure (cajole) DSpace-1.4 into using S3
> > > as an assetstore. The Amazon blurb says that S3:
> > >
> > > `Uses standards-based REST and SOAP interfaces designed to work with any
> > > Internet-development toolkit.'
> > >
> > >
> > > Best regards,
> > >
> > > Richard MAHONEY
> > >
> > >
> > >
> --
> Richard MAHONEY | internet: http://indica-et-buddhica.org/
> Littledene | telephone/telefax (man.): +64 3 312 1699
> Bay Road | cellular: +64 27 482 9986
> OXFORD, NZ | email: [EMAIL PROTECTED]
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Indica et Buddhica: Materials for Indology and Buddhology
> Repositorium: http://indica-et-buddhica.org/repositorium/
> Philologica: http://indica-et-buddhica.org/philologica/
> Subscriptions: http://subscriptions.indica-et-buddhica.org/
>

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

