Hi Stuart
Also please see:
http://wiki.lib.sun.ac.za/index.php/SUNScholar/Digital_Signing
Cheers
hg
On 08/10/2010 20:31, Stuart Lewis wrote:
Hi Hilton,
- Assetstore: random structure causes large overhead on filesystem for no real
gain
Are you able to expand on the overhead that is caused, and from your profiling,
explain how the structure could be improved? My gut (and uninformed) instinct
would be that since asset store reads are completely random depending on the
items being viewed at the time, the layout of directories would be irrelevant.
Writes may be slightly less efficient, but since writes only tend to occur
once, they are of less consequence.
Apologies for sounding cryptic; I was trying not to be too verbose in the
template. :-)
This has mostly to do with back-ups. With about 600,000 files in random
directories, it can be hard to find out which files have changed. We implemented
a simple asset store structure that stores files by year/month/day. This means
we can mirror new files very quickly, and only traverse the entire assetstore
every other day to check whether files have changed.
See: http://hdl.handle.net/10019.1/3161
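
The year/month/day layout described above could be sketched roughly as follows.
This is only an illustration in Python, not the code from the paper or from
DSpace; the function names (day_dir, dated_path, dirs_since) and the example
file name are hypothetical.

```python
from datetime import date, timedelta
from pathlib import Path


def day_dir(root, day):
    """Return the directory root/YYYY/MM/DD for the given date."""
    return Path(root) / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


def dated_path(root, day, name):
    """Return the path a file stored on `day` would get in this layout."""
    return day_dir(root, day) / name


def dirs_since(root, since, today):
    """List the existing day directories from `since` to `today`, inclusive.

    A nightly mirror only needs to copy these, instead of walking the
    whole asset store to discover what is new.
    """
    found = []
    day = since
    while day <= today:
        candidate = day_dir(root, day)
        if candidate.is_dir():
            found.append(candidate)
        day += timedelta(days=1)
    return found
```

For example, dated_path("/dspace/assetstore", date(2010, 10, 8), "file.pdf")
gives /dspace/assetstore/2010/10/08/file.pdf. Note that this only finds files
that were added; the occasional full traversal is still needed to detect
changes to older files.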
How strange, I also proposed such a thing!!
I've just read this paper and have a question. You state the following:
----
At the moment, December 2009, the following two are the most widely used
software packages for building and maintaining institutional repositories
according the opendoar website.
• http://www.dspace.org with 502 installations.
• http://www.eprints.org with 261 installations.
The digital objects and store are located as follows for the above:
• DSpace => $DSPACE_HOME/assetstore
• EPrints => $EPRINTS_HOME/disk0
None of the above use a time/date based file system for storing digital
objects. None of them use UUID's to create unique digital
objects and stores.
In one hundred years time how can any of the above satisfy a future researcher
that the digital object is unique and has remained persistently so during the
years to 2109.
----
Are you able to expand on your reasoning that repositories that do not use
datestamped directories and filenames containing UUIDs will not satisfy future
researchers?
Storing a file in that location with a UUID makes it no more or less likely
that it has remained unique and persistent. Filenames alone cannot
guarantee this - it is up to the repository to manage the integrity of the stored
items, and the wider system to ensure that this is the case. This is where the
notion of a 'trusted repository' comes into play - the fact that the repository
platform and the system as a whole are trusted to have maintained the integrity
of the contents.
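
To make the point about managed integrity concrete, here is a minimal sketch
of one common approach: record a checksum for every stored file and re-verify
later. It is an illustration only, not a description of how DSpace or EPrints
do this; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its current checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """Return relative paths that are missing or no longer match the manifest."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems
```

The manifest would have to be kept somewhere the asset store cannot silently
overwrite it, otherwise it proves very little. Either way, the evidence of
integrity comes from the recorded checksums and not from the file names.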
[A side note: You'll find a lot of the work that Tim has been leading recently
regarding AIPs is of interest in this area.
https://wiki.duraspace.org/display/DSPACE/AipBackupRestore ]
Cheers,
Stuart Lewis
IT Innovations Analyst and Developer
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: +64 (0)9 373 7599 x81928
--
Hilton Gibson
Systems Administrator
JS Gericke Library
Room 1053
Stellenbosch University
Private Bag X5036
Stellenbosch
7599
South Africa
Tel: +27 21 808 4100 | Cell: +27 84 646 4758
"Simplicity is the ultimate sophistication"
Leonardo da Vinci
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech