Hi Vlastik,

How far DSpace will scale also depends a lot on how the repository is used. For example, if it is to be used as a management tool with very little access, it will scale further than if you plan on having many simultaneous users all interacting with the contents.

There are also options for 'scaling out' the repository, depending on your planned usage patterns. For example, if there will be a lot of 'reads' of items, you can install multiple front-end servers and replicate the Solr search indexes. One front-end server could be configured to allow logins, whilst all the others have logins disabled and are restricted to read-only operations. Other parts of the infrastructure, such as the database (PostgreSQL / Oracle), will also have their own methods of being scaled up and out.

If you do decide to use DSpace in this fashion, or indeed any system, you will probably need to invest a reasonable amount of time in tuning the system for performance. If you learn any lessons from this, the DSpace community would benefit greatly if you were happy to share them.

Best wishes,

Stuart Lewis
Head of Research and Learning Services
Deputy Director Library & University Collections, Information Services
University of Edinburgh
[email protected]

On 08/04/2013 16:24, "Tim Donohue" <[email protected]> wrote:

Hi Vlastik,

This had slipped my mind, but there was some scalability testing by U of Cambridge in 2010. They tested with DSpace 1.6.2. At the time they ran into scalability/memory issues when loading DSpace 1.6.2 with 12 TB worth of data:

http://dspace.2283337.n4.nabble.com/Dspace-tech-Scalability-issues-report-DSpace-Cambridge-td3287701.html

However, based on Cambridge's reported issues, we made many scalability/memory usage enhancements in DSpace 1.7.0 (and Cambridge verified that those resolved their issues, though I cannot seem to track down that email). More notes on the performance improvements in 1.7.0 are in our 1.7.0 release notes:

https://wiki.duraspace.org/display/DSPACE/DSpace+Release+1.7.0+Notes

Since then, we've kept a closer watch for possible memory leaks. I cannot guarantee we've caught all of them, but if any are noticed, we'd gladly try to resolve them ASAP.

U of Cambridge is one of the larger (known) DSpace instances. I'm not sure how much data they currently have, but at least in 2010 they said they had around 12 TB (200K items).

- Tim

On 4/8/2013 9:49 AM, Tim Donohue wrote:
> Hi Vlastik,
>
> Unfortunately, as far as I'm aware there are no DSpace installations with many TBs worth of data. (If anyone out there is running DSpace with large amounts of data, we'd definitely love to hear about your experiences!)
>
> I'd hope that DSpace could scale to that level. But, to be completely honest, we've never had anyone attempt it. However, should you have the resources to do this sort of scalability testing, we'd definitely appreciate feedback on any issues you run into (if any).
>
> We do our best to ensure that DSpace is scalable. But, as we are a team of volunteers, we don't always have the resources to do extensive, large-scale scalability testing (and therefore we are forced to depend on the community to help report such issues to us). However, we'd do our best to help resolve any issues you'd encounter -- we've worked with others in the past when they've noticed scalability or memory leak issues in DSpace.
>
> If you were to encounter issues, they would likely be memory-related. In recent releases we've done some work to plug some longer-standing memory leaks. But I cannot guarantee we've located them all. Again though, this is something we'd love feedback on -- we'd want to fix memory leaks as quickly as we can.
>
> I'm not sure if that helps or not.
>
> - Tim
>
> On 4/4/2013 8:44 AM, Vlastimil Krejcir wrote:
>> Hello all,
>>
>> I was recently asked a question about DSpace scalability. Assume the following project:
>>
>> 16 million items (bitstream size about 230 TB), increasing by 3 million items (86 TB) per year
>>
>> Is DSpace able to handle this? My answer was that I don't know. Is anyone working with such big loads of data? What is your opinion?
>>
>> Regards,
>>
>> Vlastik
>>
>> ----------------------------------------------------------------------------
>> Vlastimil Krejčíř
>> Library and Information Centre, Institute of Computer Science
>> Masaryk University, Brno, Czech Republic
>> Email: krejcir (at) ics (dot) muni (dot) cz
>> Phone: +420 549 49 3872
>> ICQ: 163963217
>> Jabber: [email protected]
>> ----------------------------------------------------------------------------

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

_______________________________________________
Dspace-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-general
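
For a rough sense of scale, the figures quoted in this thread can be put side by side. The sketch below is only back-of-the-envelope arithmetic, not a capacity plan: it assumes growth stays linear at the stated yearly rate and that 1 TB means 10**12 bytes, and the names `Repository` and `project` are hypothetical helpers, not anything from DSpace.

```python
from dataclasses import dataclass

TB = 10**12  # assumption: decimal terabytes
MB = 10**6   # assumption: decimal megabytes


@dataclass(frozen=True)
class Repository:
    """Item count and total bitstream size of a repository (or of a yearly increment)."""
    items: int
    terabytes: float

    def avg_item_mb(self) -> float:
        """Average bitstream size per item, in MB."""
        return self.terabytes * TB / self.items / MB


def project(start: Repository, yearly: Repository, years: int) -> Repository:
    """Project size after `years`, assuming the same increment is added every year."""
    return Repository(
        items=start.items + yearly.items * years,
        terabytes=start.terabytes + yearly.terabytes * years,
    )


# Figures as stated in the thread.
proposed = Repository(items=16_000_000, terabytes=230)       # Vlastik's project
growth = Repository(items=3_000_000, terabytes=86)           # stated yearly increase
cambridge_2010 = Repository(items=200_000, terabytes=12)     # Cambridge, as reported for 2010

if __name__ == "__main__":
    for years in (0, 1, 5):
        p = project(proposed, growth, years)
        print(f"after {years} year(s): {p.items:,} items, {p.terabytes:,.0f} TB")
    print(f"average item size, proposed:  {proposed.avg_item_mb():.1f} MB")
    print(f"average item size, Cambridge: {cambridge_2010.avg_item_mb():.1f} MB")
    print(f"item count vs Cambridge 2010: {proposed.items / cambridge_2010.items:.0f}x")
```

Under these assumptions the proposed repository starts at about 80 times Cambridge's 2010 item count and would reach 31 million items and 660 TB after five years. The item count may therefore matter at least as much as raw storage when testing.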
