Hello All,

I'm writing to share the initial results of a DSpace stress test we are 
performing, and to ask for your thoughts and suggestions as we begin 
ramping up for the next round of tests.

http://sites.google.com/a/ohiolink.edu/drmc/Home/stretch-armstrong

Background: At OhioLINK we've built a federation of DSpace instances 
across the state called the Digital Resource Commons 
(http://drc.ohiolink.edu), and we hope to expand our offerings beyond 
the academic library community. As part of that effort, we are 
building a multi-million-record test of DSpace using Amazon's Elastic 
Compute Cloud (EC2).

Item Import: Like many of you, we noticed longer batch submission times 
as our repository grew beyond 200,000 items and knew we needed to find a 
solution.

Our first goal was to confirm the results of the ROAD Project 
(http://www.jisc.ac.uk/whatwedo/programmes/reppres/tools/road.aspx) test 
mentioned by Stuart Lewis. In Stuart's scenario, the entire 
300,000-record submission took place at one time. We wanted to see 
whether the problem persists even when the submission is broken up into 
several smaller blocks and takes place over a period of days.

Our initial data confirm and extend the results Stuart posted in 'DSpace 
at a Third of a Million Items':
http://blog.stuartlewis.com/wp-content/uploads/2009/01/dspace-banding.png
While he was interested in the performance of the SWORD client, his 
experiment shows a steady increase in the time-to-ingest for a single 
batch submission of 300,000 records. We've confirmed the problem rests 
with the submission process itself, and is not just an issue with 
extremely large 'one shot' batches.

Special thanks to Stuart (Auckland University), Tom De Mulder and Simon 
Brown (Cambridge University) for their early comments and assistance.

Please feel free to send along your own insights and suggestions,

John Davison
Assistant Director
Digital Resource Development
