Re: [Dspace-tech] DSpace optimization
Dear Jayan,

On Tue, 2007-07-24 at 20:25, James Dickson wrote:
> Hi Jayan,
>
> What part of DSpace are you having difficulty with? Browse, search, indexing? For indexing we have implemented a batch indexing process that is not as memory intensive as the existing one. There are a few tweaks that can be performed to speed up the browsing. Unfortunately, throwing more memory at the problem will increase the number of concurrent users you can serve, but will not really have much effect on performance.
>
> James
>
> Jayan Chirayath Kurian wrote:
>> Hi! Can anyone suggest how to allocate more memory to Tomcat and PostgreSQL for a server with 1 GB RAM, a 300 GB hard disk and 170,000 records? Will allocating memory improve the client access speed?

Something that _may_ help -- and from memory, rarely mentioned -- is ditching Tomcat and using a decent Web or Application Server. Personally, I've never considered using anything but Sun's Java System Web Server: v 6.1 and now 7.0. The latest incarnation has the option to pre-compile JSPs during deployment. This seems to significantly improve performance.

As to solving the memory and CPU requirements of PostgreSQL, well the short answer is to ditch that too, and move to a pure file system based solution ;) It's my hope that DSpace will eventually make this jump, but unfortunately I have not been in a position to wait.

On a machine with similar resources to your own, and well before reaching your 170,000 records, I became so frustrated with the sluggishness of the DSpace batch import system that I moved the vast majority of my content -- mostly bibliographical records -- to a Zebra server fronted by a YAZ/PHP interface. For the first time in a long while I am confident that my system will continue to scale well beyond my anticipated needs.

I'm still using DSpace to archive a relatively small collection of digital material, and the general performance is perfectly acceptable.
But in its present form, I have given up the hope of using DSpace for anything more than 20,000 items or so. That said, I am sure DSpace will be worth revisiting once the planned architectural developments have come to pass. Best regards, Richard -- Richard MAHONEY | internet: http://indica-et-buddhica.org/ Littledene | telephone/telefax (man.): +64 3 312 1699 Bay Road| cellular: +64 27 482 9986 OXFORD, NZ | email: [EMAIL PROTECTED] ~~~ Indica et Buddhica: Materials for Indology and Buddhology Scholia: http://scholia.indica-et-buddhica.org/ Subscriptions: http://subscriptions.indica-et-buddhica.org/ - This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now http://get.splunk.com/ ___ DSpace-tech mailing list DSpace-tech@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspace-tech
Re: [Dspace-tech] Dspace on Sun SPARC
Dear Mika,

On Thu, 2007-06-21 at 03:07, Mika Stenberg wrote:
> Has anyone managed to run DSpace on Sun SPARC servers? I was hoping to, but it seems there is no Java available for Linux on the SPARC architecture. Any experience with this?

University of Auckland: Solaris 10 / Sun Fire E25K (My God!)
University of Canterbury: Solaris 10 / Sun Fire 440

See: ResearchSpace at Auckland - Disaster Recovery (DR) - Yin Yin Latt (http://www.ira.auckland.ac.nz/seminar/)

Why, may I ask, would you want to run Linux on SPARC in preference to Solaris?

Best regards,
Richard
Re: [Dspace-tech] [Dspace-general] DSpace `Dublin Core' | Date Issued | Date Range | How to represent
Hi Scott,

On Wed, 2007-06-13 at 10:44, Scott Yeadon wrote:
>> Granted that using something such as:
>>
>> <dcvalue element="date" qualifier="issued">1964-1970</dcvalue>
>>
>> in one's `dublin_core.xml' file seems practical and expedient, on my system at least -- DSpace-1.4.0 -- such an approach breaks DSpace's `Browse by Title', `Browse by Date', and the offending item's `Brief View'. This is the reason I first asked the lists for details of how one should _correctly_ represent a date range in the DSpace `dublin_core.xml' file. Using `1964-1970' and so on simply does not seem to work.
>
> It's likely that this is because the default metadata display is not able to render date ranges properly. In your DSpace config file put the following entry:
>
> webui.itemdisplay.default = ... dc.date.issued, ...

done

> The date.issued field is by default formatted as a date (see ItemTag.java for the hardcoded list) using the dc.date.issued(date) field display text. Removing the (date) part of this will stop any special rendering taking place. Also, setting:
>
> webui.itemlist.columns = dc.date.issued, ...

ditto

> in the dspace.cfg file may also resolve your ranges not showing up in the browse page (the default specifies dc.date.issued(date), so as above removing the rendering rules should fix this).

Thank you very much for this suggestion Scott. With a small test sample this makes all the ranges visible when browsing. Once I've prepared and loaded a decent number of records following the `1964-1970' pattern I'll test the sorting and so on.
Kind regards,
Richard
Re: [Dspace-tech] [Dspace-general] DSpace `Dublin Core' | Date Issued | Date Range | How to represent
Hello Scott,

Thanks for your note.

On Tue, 2007-06-12 at 12:06, Scott Yeadon wrote:
> Hi Richard,
>
> It's up to you how you represent your values, you could use the DCMI Period or something simple such as 1930-1940. We tend to have the latter since that's what our users typically enter. The batch import process won't parse the values, as long as the document is valid XML the values will be accepted.

Granted that using something such as:

<dcvalue element="date" qualifier="issued">1964-1970</dcvalue>

in one's `dublin_core.xml' file seems practical and expedient, on my system at least -- DSpace-1.4.0 -- such an approach breaks DSpace's `Browse by Title', `Browse by Date', and the offending item's `Brief View'. This is the reason I first asked the lists for details of how one should _correctly_ represent a date range in the DSpace `dublin_core.xml' file. Using `1964-1970' and so on simply does not seem to work.

I have put together a series of screenshots to indicate the issues:

http://indica-et-buddhica.org/sections/repositorium-preview/known-issues/dspace-item-date-ranges

As you will see, I am - unhappily - coming to the conclusion that DSpace does not support item date ranges at all. It is also becoming clear that the lack of genuine validation by the item importer can easily lead to the widespread corruption of one's metadata. I hope I am wrong as these would be serious deficiencies.

Best regards,
Richard Mahoney

> Scott.
> Message: 3
> Date: Fri, 08 Jun 2007 12:08:16 +1200
> From: Richard MAHONEY [EMAIL PROTECTED]
> Subject: [Dspace-general] DSpace `Dublin Core' | Date Issued | Date Range | How to represent
> To: DSpace Tech dspace-tech@lists.sourceforge.net, DSpace General [EMAIL PROTECTED]
> Message-ID: [EMAIL PROTECTED]
> Content-Type: text/plain
>
> Dear List Members,
>
> I am in the process of preparing material for bulk import and have again encountered an issue that I was inclined to gloss over last time it arose: the format of the DSpace Dublin Core Date Elements, Qualifiers, and particularly, the Values. What exactly is the required Value format and is it configurable?
>
> Simple date Values such as the following present no difficulty:
>
> <dcvalue element="date" qualifier="issued">1970</dcvalue>
>
> The trouble for me -- and this situation would arise often for many projects -- is how to correctly represent date ranges, e.g., date issued, 1964 to 1970. Which Value format should be used to represent a date range in DSpace DC? Some DSpace version of the W3C-DTF/ISO 8601 scheme?
>
> http://dublincore.org/documents/2000/07/28/dcmi-period/
>
> Best regards,
> Richard Mahoney
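For readers following the thread, here is a minimal Python sketch of what a DCMI Period value for the 1964 to 1970 case might look like, following the encoding described at the URL above. The helper names `dcmi_period` and `dcvalue` are hypothetical and are not part of DSpace; nothing here implies that DSpace 1.4.0 parses, displays, or sorts such values.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def dcmi_period(start: str, end: str, name: Optional[str] = None) -> str:
    """Build a DCMI Period string (hypothetical helper, not a DSpace API)."""
    parts = []
    if name:
        parts.append(f"name={name}")
    parts.append(f"start={start}")
    parts.append(f"end={end}")
    # Assumption: start and end are written in W3C-DTF form (e.g. YYYY).
    parts.append("scheme=W3C-DTF")
    return "; ".join(parts) + ";"


def dcvalue(element: str, qualifier: str, value: str) -> str:
    """Render one <dcvalue> line for a dublin_core.xml file."""
    return (
        f"<dcvalue element={quoteattr(element)} "
        f"qualifier={quoteattr(qualifier)}>{escape(value)}</dcvalue>"
    )


period = dcmi_period("1964", "1970")
line = dcvalue("date", "issued", period)
print(line)
```

Whether the stock browse and item views cope with a value of this shape any better than with a bare `1964-1970' is exactly the open question in this thread.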
Re: [Dspace-tech] [Dspace-general] DSpace `Dublin Core' | Date Issued | Date Range | How to represent
Hello Paulo Jobim,

On Wed, 2007-06-13 at 03:01, instituto A.C.Jobim wrote:
> Hi Richard
>
> I think using the date.issued field is quite confusing because that's what DSpace automatically uses when you don't check the "item has been published before" box.

My use of `date.issued' for previously published items is consistent with the recommendations here:

http://www.dspace.org/technology/metadata.html

A short summary of the suggested element / qualifier pairs follows:

1.) date -- Use qualified form if possible
2.) date accessioned -- Date DSpace takes possession of item
3.) date available -- Date or date range item became available to the public
4.) date copyright -- Date of copyright
5.) date created -- Date of creation or manufacture of intellectual content if different from date.issued
6.) date issued -- Date of publication or distribution
7.) date submitted -- Recommend for theses and dissertations

Clearly, dc.date.issued is to be used for the date of `original' publication in the case of previously published material. dc.date.created, on the other hand, is only to be used in addition to, and not as a substitute for, dc.date.issued.

> At the Institute here we used date.created and maybe I will change it to simply date.

As above, it is probably best to use a qualified form if you can.

> Researchers here sometimes use brackets or parentheses to indicate that a date is a guess of the researcher, and all these things break the browse by date page and make the sorting random. We finally decided to treat this field as text and not date (in dspace.cfg), so I remove the brackets in the field sort_date and everybody uses the YYYY-MM-DD format, so periods like 1960-1970 will be sorted correctly and the browse page will not break.

The primary issue for me is how to represent date ranges in the dublin_core.xml file so that DSpace can adequately sort and display the range in `Browse by title' and `Browse by date', and display the range in `Brief item view'.
If I properly understand the relatively few responses to my query, the short answer is that one cannot. The stock DSpace distribution does not enable one to specify a date range in an item's date metadata fields, for example, something along the lines of the DCMI Period Encoding Scheme:

http://dublincore.org/documents/dcmi-period/

This is a serious deficiency in any system, let alone in one that has pretensions to provide a basis for a digital archive. I would welcome comments from DSpace core developers on their proposed solution to DSpace's lack of support for encoding periods.

Best regards,
Richard Mahoney

> I hope this helps
> Paulo Jobim
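The workaround Paulo describes (treat the date as text, strip the brackets into a separate sort field) can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions: the function name `sort_date` borrows the field name from his message, the sample values are invented, and the code is not taken from DSpace or from his installation.

```python
import re


def sort_date(raw: str) -> str:
    """Strip the brackets/parentheses researchers use to mark guessed dates.

    The result is compared as plain text, so it only sorts sensibly when
    every value starts with a four-digit year (YYYY, YYYY-MM-DD, YYYY-YYYY).
    """
    return re.sub(r"[\[\]()]", "", raw).strip()


# Invented sample values: guessed dates, a range, and a full date.
values = ["[1965]", "1960-1970", "(1958)", "1964-03-02", "1970"]
ordered = sorted(values, key=sort_date)
print(ordered)
```

A range such as `1960-1970' sorts by its start year here, which matches the behaviour described above; it says nothing about how the value is displayed.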
[Dspace-tech] DSpace Dublin Core DTD or Schemas | Where are they?
Dear Readers,

I'm preparing several thousand records for bulk import and wish to validate all the `dublin_core.xml' files against the DSpace Dublin Core Schema. Unfortunately, having gone through the source code of 1.4.0 I can't find any DTD, or XML or other type of Schema against which to validate `dublin_core.xml'. Surely some sort of DTD or Schema must exist, if only to check the validity of bulk imported material?

Any pointers would be much appreciated.

Best regards,
Richard Mahoney
Re: [Dspace-tech] DSpace Dublin Core DTD or Schemas | Where are they?
On Fri, 2007-06-08 at 13:13, Pollard, Marvin wrote:
> Richard,
>
> Although this might not qualify as a DTD it might help you to move along in your bulk loading project: http://www.dspace.org/technology/metadata.html

Thank you for the pointer Marvin. I should perhaps have said that I had searched the documentation and web, and that this was one of the first pieces I consulted ;)

The trouble is that I want to be able to ensure that not only the DSpace DC Elements and Qualifiers are valid but also the Values -- e.g. date values. I am beginning to wonder if the DSpace bulk uploader does any serious validity checking on the `dublin_core.xml' file at all. Last time I bulk uploaded a decent amount of material I noticed that I could feed in all manner of date issued data. All was accepted and the UI rendering was predictably poor. I want to ensure more consistency this time round, hence my request for the `Definitive DSpace DTD or Schema'.

Best,
Richard

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard MAHONEY
> Sent: Thursday, June 07, 2007 5:42 PM
> To: DSpace Tech; DSpace General
> Subject: [Dspace-tech] DSpace Dublin Core DTD or Schemas | Where are they?
>
> Dear Readers,
>
> I'm preparing several thousand records for bulk import and wish to validate all the `dublin_core.xml' files against the DSpace Dublin Core Schema. Unfortunately, having gone through the source code of 1.4.0 I can't find any DTD, or XML or other type of Schema against which to validate `dublin_core.xml'. Surely some sort of DTD or Schema must exist, if only to check the validity of bulk imported material?
>
> Any pointers would be much appreciated.
>
> Best regards,
> Richard Mahoney
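In the absence of a published DTD or Schema, one stopgap is to check the files oneself before import. The following is a minimal Python sketch, assuming the usual `dublin_core.xml' layout of a <dublin_core> root holding <dcvalue element="..." qualifier="..."> children; the function name `check_dublin_core` and the date rule (YYYY, YYYY-MM or YYYY-MM-DD only) are assumptions of this sketch, not DSpace behaviour.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: only W3C-DTF style dates YYYY, YYYY-MM or YYYY-MM-DD are wanted.
DATE_RE = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")


def check_dublin_core(xml_text):
    """Return a list of problems found in one dublin_core.xml document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    if root.tag != "dublin_core":
        problems.append(f"unexpected root element <{root.tag}>")
    for dcvalue in root.iter("dcvalue"):
        element = dcvalue.get("element")
        qualifier = dcvalue.get("qualifier")
        value = (dcvalue.text or "").strip()
        if not element:
            problems.append("dcvalue without an element attribute")
        if not value:
            problems.append(f"empty value for {element}.{qualifier}")
        elif element == "date" and not DATE_RE.match(value):
            problems.append(f"suspect date value {value!r} for date.{qualifier}")
    return problems


sample = """<dublin_core>
  <dcvalue element="title" qualifier="none">A test record</dcvalue>
  <dcvalue element="date" qualifier="issued">1970</dcvalue>
  <dcvalue element="date" qualifier="issued">1964-1970</dcvalue>
</dublin_core>"""
print(check_dublin_core(sample))
```

Run over a directory of items before import, a check of this kind would at least flag values such as `1964-1970' for a decision, rather than letting them through silently.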
Re: [Dspace-tech] [Dspace-general] DSpace: digital archive or literature archive?
Hello Derek,

On Fri, 2007-06-01 at 21:38, Derek Hohls wrote:
> Richard
>
> Thanks for sharing those ideas and thoughts. I looked at the Nuxeo site, and also read through the technical comparison by Richard Wyles - very interesting. I also looked at the Fedora case study implementation by Richard Green.
>
> In summary, I have gathered that:
>
> * DSpace is less technically capable, does not scale as well, does not handle complex objects or a variety of objects, or mass-uploading of data, but has an easy and simple front-end for users and administrators. There is also a wealth of start-up material and a good community.
>
> * Fedora is more technically capable, scales well (within our likely limits at least), seems to handle complex objects with a variety of data types - MIME-based. There is no front-end that works on the web; and the Java interface that is supplied looks absolutely barebones at best. The concepts and ideas of Fedora also seem quite complex and are not clearly explained in the starting documentation. User docs and tutorials seem minimal. Community support is unknown.
>
> Richard Green's case study says: Fedora 'out of the box' was a software tool with an associated very steep learning curve and a user had to rely heavily on documentation available on the Fedora website... we came to realise that the documentation appeared to lack some crucial elements and that, for a first time user, it was sometimes not easy to follow.
>
> This leaves us in a difficult position between two choices:
>
> (a) to hold off and hope for Fedora to significantly improve the front end and user documentation... which might be problematic as it's not clear how their funding will continue after September this year (2007), and there is no project roadmap, so it's not that clear as to what they will actually focus on.
> (b) to go on with DSpace, and acknowledge that it's a temporary solution which may not adequately address many of our use cases (although still a step up from holding all research data on local drives or on a DMS). If we later decide to switch to Fedora, I hope it would be possible to extract the content for the new system. DSpace says this is possible: http://wiki.dspace.org/index.php//EndUserFaq#Can_I_export_my_digital_material_out_of_DSpace.3F

Another option -- which I forgot to mention -- may be MyCoRe, at least once the interface and documentation are available in English (anticipated):

About MyCoRe: http://www.mycore.de/content/main/information.xml
Features: http://www.mycore.de/content/main/information/description.xml
Applications (Deployments): http://www.mycore.de/content/main/anwendungen.xml
MyCoRe Documentation: http://www.mycore.de/content/main/documentation.xml

Note the commitment to support enterprise grade databases, support for audio and video streaming, and a Z39.50 interface.

Best regards,
Richard Mahoney
Re: [Dspace-tech] [Dspace-general] DSpace: digital archive or literature archive?
/products/
http://www.nuxeo.org/static/snapshots/ (Download Daily Snapshots)

v.) Nuxeo 5 Roadmap: http://www.nuxeo.org/sections/about/roadmap/

vi.) Nuxeo Clients: http://www.nuxeo.com/en/customers/

vii.) Mailing Lists (Nuxeo 5): http://lists.nuxeo.com/mailman/listinfo/ecm

Best regards,
Richard Mahoney
Re: [Dspace-tech] seeking sample records suggestions for Portable Citations (Summer of Code)
   , type = "Wishful Research Result",
   number = "7",
   address = "Computer Science Department, Fanstord, California",
   month = oct,
   year = 1988,
   note = "This is a full TECHREPORT entry",
}

@UNPUBLISHED{unpublished-minimal,
   author = "Ulrich {\"{U}}nderwood and Ned {\~N}et and Paul {\={P}}ot",
   title = "Lower Bounds for Wishful Research Results",
   note = "Talk at Fanstord University (this is a minimal UNPUBLISHED entry)",
}

@UNPUBLISHED{unpublished-full,
   author = "Ulrich {\"{U}}nderwood and Ned {\~N}et and Paul {\={P}}ot",
   title = "Lower Bounds for Wishful Research Results",
   month = nov # ", " # dec,
   year = 1988,
   note = "Talk at Fanstord University (this is a full UNPUBLISHED entry)",
}

@MISC{random-note-crossref,
   key = {Volume-2},
   note = "Volume~2 is listed under Knuth \cite{book-full}"
}

end xampl.bib

Thanks for taking up this project, and best of luck.

Best regards,
Richard Mahoney
Re: [Dspace-tech] Assetstore physical storage (Amazon's Simple Storage Service: S3)
Dear Richard,

On Thu, 2007-04-19 at 04:23, Richard Rodgers wrote:
> Richard:
>
> I'm putting up a prototype implementation of (inter alia) an S3 backend on the DSpace wiki (see the 'PluggableStorage' page). Would love volunteers to vet it (not ready for production).
>
> Thanks,
> Richard R.

Without wanting to sound overly effusive, I'd just like to say how deeply grateful I am that you are working on the Amazon S3 bitstore. This is all very exciting and I hope to experiment with S3BitStore once I am finished migrating Indica et Buddhica to Joyent/TextDrive, hopefully by the end of the month.**

...

Something I'd like to ask before then though. Presently all the material I hold on S3 consists of encrypted compressed tarballs (Solaris 10: gtar, bzip2, encrypt). These can be created using UNIX pipes, similar to producing encrypted tape backups. How hard would it be, then, to use S3BitStore to send encrypted, possibly compressed, data to an assetstore on S3? I already send and retrieve all material using SSL. It seems to me that the addition of data encryption and compression would certainly go some way to reassuring an institution wishing to archive sensitive material cost effectively.

Would all of this be non-trivial? Any thoughts?

Kind regards,
Richard M.

** I think I recall reading a while ago on this list about firms, notably TextDrive, being unwilling to host Java apps. It seemed that if one wished to run DSpace one needed a dedicated machine. This is no longer the case. See Joyent/TextDrive's Accelerators: http://radiant.joyent.com/accelerator/
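To make the question concrete, the gtar | bzip2 | encrypt pipeline can be approximated in memory with the Python standard library. This is a rough sketch under loud assumptions: `pack` and `unpack` are hypothetical names, the `encrypt` and `decrypt` arguments are caller-supplied callables that default to doing nothing (no real cipher is included here), and nothing in it reflects how the prototype S3BitStore is actually written.

```python
import io
import tarfile
from typing import Callable, Dict

Transform = Callable[[bytes], bytes]


def pack(files: Dict[str, bytes], encrypt: Transform = lambda b: b) -> bytes:
    """Bundle files into a bzip2-compressed tarball, then pass it to `encrypt`.

    `encrypt` is a placeholder hook: the default leaves the data unencrypted.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return encrypt(buf.getvalue())


def unpack(blob: bytes, decrypt: Transform = lambda b: b) -> Dict[str, bytes]:
    """Reverse of pack(): decrypt, decompress, and return name -> content."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(decrypt(blob)), mode="r:bz2") as tar:
        for member in tar.getmembers():
            extracted = tar.extractfile(member)
            if extracted is not None:
                files[member.name] = extracted.read()
    return files


original = {"item/dublin_core.xml": b"<dublin_core/>", "item/contents": b"paper.pdf\n"}
blob = pack(original)          # this is what would be sent to the remote store
restored = unpack(blob)
print(restored == original)
```

The point of the sketch is only that compression and encryption can sit as a thin layer in front of whatever call actually uploads the bytes; the cost of doing this per bitstream inside DSpace is the open question.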
Re: [Dspace-tech] Assetstore physical storage (Amazon's Simple Storage Service: S3)
Dear Robert et al.,

On Thu, 2007-04-12 at 07:15, Robert Tansley wrote:
> We considered this way back when (2001); we decided on using the filesystem because some files might be very very large, there might be lots of them and in general it's easier to split filesystem-based asset stores across multiple drives/machines than a big relational database.
>
> That said, the intention was that storage would be made pluggable -- so you could have RDBMS, SRB/iRODs, open-source GoogleFileSystem, LOCKSS-ish etc. storage. That pluggability ended up being one of the many non-critical-for-version-1 features we had to drop to get DSpace 1.0 finished :-) There are some projects (e.g. the MIT ones) looking at how to really accomplish this.

Over the past few weeks I've been using Amazon's Simple Storage Service (S3):

http://www.amazon.com/gp/browse.html?node=16427261

At this point I've merely been using it to back up web servers and development directories. This has involved the simple upload of compressed tarballs (using the Java app. jSh3ll) but also the synchronising of file systems (using the Ruby app. s3sync). In all, I've been pleasantly surprised by the results. It would seem that the S3 storage system promises to be more resilient than anything I could build at a reasonable cost.

Although I've only been using S3 for remote backup, it seems that it can also be used as a live file system for storing and retrieving data for web apps. I am wondering, then, if anyone may be able to suggest how it might be possible to configure (cajole) DSpace-1.4 into using S3 as an assetstore. The Amazon blurb says that S3: `Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.'
Best regards, Richard MAHONEY -- Richard MAHONEY | internet: http://indica-et-buddhica.org/ Littledene | telephone/telefax (man.): +64 3 312 1699 Bay Road| cellular: +64 27 482 9986 OXFORD, NZ | email: [EMAIL PROTECTED] ~~~ Indica et Buddhica: Materials for Indology and Buddhology Repositorium: http://indica-et-buddhica.org/repositorium/ Philologica: http://indica-et-buddhica.org/philologica/ Subscriptions: http://subscriptions.indica-et-buddhica.org/ - Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV ___ DSpace-tech mailing list DSpace-tech@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspace-tech
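For readers following the thread, the pluggable storage Robert describes can be illustrated with a minimal sketch. It assumes a single put/get interface behind which a filesystem store or a remote object store could sit. Every name here (BitstreamStore, FileSystemStore, InMemoryObjectStore, round_trip) is hypothetical and is not a DSpace API. The in-memory class only stands in for a remote key/value service such as S3 and makes no network calls. Using the SHA-256 digest of the content as the identifier is likewise an assumption of this sketch, not how DSpace names bitstreams.

```python
import hashlib
import os
from abc import ABC, abstractmethod


class BitstreamStore(ABC):
    """Hypothetical storage interface; not part of DSpace."""

    @abstractmethod
    def put(self, data: bytes) -> str:
        """Store data and return an opaque identifier."""

    @abstractmethod
    def get(self, identifier: str) -> bytes:
        """Return the bytes previously stored under identifier."""


class FileSystemStore(BitstreamStore):
    """Keeps each bitstream as a file under a base directory."""

    def __init__(self, base_dir: str) -> None:
        self.base_dir = base_dir

    def _path(self, identifier: str) -> str:
        # Fan out into two directory levels so no single directory grows huge.
        return os.path.join(
            self.base_dir, identifier[:2], identifier[2:4], identifier
        )

    def put(self, data: bytes) -> str:
        identifier = hashlib.sha256(data).hexdigest()
        path = self._path(identifier)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
        return identifier

    def get(self, identifier: str) -> bytes:
        with open(self._path(identifier), "rb") as handle:
            return handle.read()


class InMemoryObjectStore(BitstreamStore):
    """Stand-in for a remote key/value object store; no network calls."""

    def __init__(self) -> None:
        self._objects = {}  # identifier -> bytes

    def put(self, data: bytes) -> str:
        identifier = hashlib.sha256(data).hexdigest()
        self._objects[identifier] = data
        return identifier

    def get(self, identifier: str) -> bytes:
        return self._objects[identifier]


def round_trip(store: BitstreamStore, data: bytes) -> bytes:
    """Store data through any backend and read it back."""
    return store.get(store.put(data))
```

The calling code only ever sees BitstreamStore, so swapping the filesystem for another backend would, in this sketch, be a matter of constructing a different class. A real S3 backend would also have to deal with authentication, latency and partial failures, none of which is modelled here.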
[Dspace-tech] Could DSpace use Amazon's Simple Storage Service (S3) as an assetstore?
Dear Subscribers,

Over the past day or two I've been testing Amazon's Simple Storage Service (S3):

http://www.amazon.com/gp/browse.html?node=16427261

At this point I've merely been using it to back up web servers and development directories. This has involved the simple upload of compressed tarballs (using the Java app jSh3ll) but also the synchronising of file systems (using the Ruby app s3sync). In all, I've been pleasantly surprised by the results. It would seem that the S3 storage system promises to be more resilient than anything I could build at a reasonable cost.

Although I've only been using S3 for remote backup, it seems that it can also be used as a live file system for storing and retrieving data for web apps. I am wondering, then, if anyone may be able to suggest how it might be possible to configure (cajole) DSpace-1.4 into using S3 as an assetstore. The Amazon blurb says that S3:

`Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.'

Best regards,

Richard MAHONEY
Re: [Dspace-tech] Are you using the DSpace History System?
Dear Richard et al.,

On Wed, 2007-03-07 at 23:27, Richard Jones wrote:

Hi Mark,

This is a survey to see if anyone is actually utilizing the existing History System in production. If you are using the history system, would you please be kind and respond with a quick sentence on how you're applying it?

I am not currently using it, but bringing it up is timely because I'm reaching a point where I am being drawn towards the necessity of an audit tool for certain system activities. I haven't had time to evaluate what the history system can do for me in that regard, but if anyone is planning on making changes to it, I'd be interested in being involved in some way, shape, or form.

Let me give you one or two examples of the kind of auditing that I need: as users add/remove files over time from their item as they prepare it, I need to track what was added/removed, by whom, and when (multiple users can work on a single item in our system). Similarly for licences. Also, administrators perform many tasks on items before they hit the public repository, and a navigable audit trail on item activities which can actually be interacted with would be of great benefit.

A decent version control system for DSpace is a must. Not only should one be able to track the changes over time to each document, but one has to be able to consult and revert to previous versions. This functionality is taken for granted with any decent Enterprise Content Management (ECM) platform.

In the hope that they might provide some pointers, I have uploaded a couple of screenshots of a document's `Status History' available within the IeB ECM system (powered by CPS-3.4/Zope):

(http://indica-et-buddhica.org/sections/repositorium/desired-features/versioning-system)

The current incarnation of CPS -- Nuxeo 5 -- is an open source Java app. It is possible that some of the version control code could be modified by the DSpace community:

(http://www.nuxeo.org/sections/about/)

Best regards,

Richard MAHONEY
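The two requirements raised above, an audit trail of who added or removed which file and when, and the ability to consult and revert to earlier versions, can be sketched together as an append-only event log. This is only an illustration under stated assumptions: the names AuditEvent and ItemHistory are hypothetical, nothing here reflects the DSpace History System or the CPS/Nuxeo code, and a "version" is simply taken to mean the state after the first N events.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AuditEvent:
    """One recorded change to an item (hypothetical structure)."""
    user: str
    action: str  # "add" or "remove"
    filename: str
    timestamp: datetime


@dataclass
class ItemHistory:
    """Append-only log of file changes for a single item."""
    events: List[AuditEvent] = field(default_factory=list)

    def record(self, user: str, action: str, filename: str,
               when: Optional[datetime] = None) -> int:
        """Append an event and return the resulting version number."""
        if action not in ("add", "remove"):
            raise ValueError("action must be 'add' or 'remove'")
        stamp = when or datetime.now(timezone.utc)
        self.events.append(AuditEvent(user, action, filename, stamp))
        return len(self.events)

    def files_at(self, version: int) -> Tuple[str, ...]:
        """Replay the first `version` events to rebuild the file list."""
        if not 0 <= version <= len(self.events):
            raise ValueError("no such version")
        files: List[str] = []
        for event in self.events[:version]:
            if event.action == "add":
                if event.filename not in files:
                    files.append(event.filename)
            elif event.filename in files:
                files.remove(event.filename)
        return tuple(files)

    def revert_to(self, user: str, version: int) -> int:
        """Restore an earlier file list by appending compensating events.

        Earlier events are never rewritten, so the revert itself is audited.
        """
        target = self.files_at(version)
        current = self.files_at(len(self.events))
        for name in current:
            if name not in target:
                self.record(user, "remove", name)
        for name in target:
            if name not in current:
                self.record(user, "add", name)
        return len(self.events)
```

Because a revert is recorded as ordinary events rather than by deleting history, the trail still shows who undid what. The sketch tracks file names only; reverting file contents would additionally require keeping the old bitstreams.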
Re: [Dspace-tech] OS for DSpace
Hi Steve,

On Thu, 2007-02-08 at 13:11, Steve Thomas wrote:

we’ve been funded for new hardware for our Digital Library (yay!) and I’m now being asked what operating system is required. The suggestion is Red Hat EL 4. I’m sure that will be fine, but (being a Solaris person) I’d like reassurance, so …

Well, this certainly makes me curious ...

Hope you don't mind me asking, but why -- given your [institution's?] experience with Solaris -- are you not planning on using Solaris 10? I'm assuming these new machines are x64 of some variety or other?

Seems to me that Sun produces a reasonably decent platform for Java web apps and the like ;)

Best regards,

Richard Mahoney