Thank you, Terry. How fast does your DSpace grow? How many items per month
or year? Do you use clustering / load balancing? What kind of hardware do
you need to run it? I would be grateful if you could share that information.

Vlastik

On 8/23/19 6:28 PM, Terry Brady wrote:
> Here are some details about DigitalGeorgetown.
> 
>   * Total items: 546,000
>   * Public items: 397,000
>   * Citation only items: ~470,000
> 
> As we tested and migrated to DSpace 6x, we did encounter a few
> performance issues.  We have contributed patches to DSpace 6x releases
> (and to the future DSpace 6.4 release) to help resolve these issues.
> 
> We preserve our assets in the APTrust (Academic Preservation Trust)
> service, so we do not run the DSpace checksum checker on our DSpace
> instance. 
> 
> Terry
> 
> On Fri, Aug 23, 2019 at 7:48 AM Tim Donohue <[email protected]> wrote:
> 
>     Hello Vlastimil,
> 
>     Unfortunately, the size of DSpace sites is very difficult to track
>     overall (it relies entirely on self reporting).  
> 
>     I know there are very large sites out there... a few that come to
>     mind are U of Cambridge (https://www.repository.cam.ac.uk/) and
>     Georgetown University (https://repository.library.georgetown.edu/).
>     I cannot claim to know exactly how large the sites are though, as
>     each of these sites may have access restricted content (which is not
>     even visible on the web).  However, in terms of public content alone
>     each has 250-350 thousand items.
> 
>     I also admit that I don't know whether there are larger sites out
>     there.  But, maybe institutions on this mailing list will
>     self-report if they have more than 400 thousand items. (I know I'd
>     love to hear which sites have >400K items!)
> 
>     I think Mark Wood gave a thorough answer regarding the number of
>     items possible in a DSpace.  Technically, the biggest limitation is
>     the amount of server space & memory available (as larger sites need
>     more of each).  For each release we attempt to make DSpace as
>     performant (and memory lean) as we can, and as memory issues are
>     reported we resolve them as bugs in a new release.  For example, for
>     the upcoming DSpace 7 release (which is still under active
>     development) we are running more detailed performance testing as
>     detailed here:
>     https://wiki.duraspace.org/display/DSPACE/DSpace+7+Performance+Testing
>     At this time, that performance testing is more geared towards
>     minimizing CPU load and memory overall (which will also help in
>     scaling).
> 
>     Tim
> 
>     ------------------------------------------------------------------------
>     *From:* [email protected] on behalf of Vlastimil Krejčíř
>     <[email protected]>
>     *Sent:* Friday, August 23, 2019 5:57 AM
>     *To:* DSpace Community <[email protected]>
>     *Subject:* [dspace-community] Scalability of DSpace
>      
>     Hi all,
> 
>     back in April 2013 I asked the community about DSpace
>     scalability, see:
> 
>     http://dspace.2283337.n4.nabble.com/DSpace-scalability-tens-of-hundreds-TBs-tt4662988.html#a4663047
> 
>     Now, in 2019, it is time to ask the same question :-).
> 
>     How much data / how many items can DSpace handle? The DSpace system
>     at Cambridge University (https://www.repository.cam.ac.uk/) was
>     reported as the largest then. I can see it stores about 245
>     thousand items nowadays.
> 
>     Does anyone else have a bigger one? Is there any new information on
>     scalability since 2013?
> 
>     Regards,
> 
>     Vlastik Krejčíř
> 
>     --
>     ----------------------------------------------------------------------------
>     Vlastimil Krejčíř
>     Library and Information Centre, Institute of Computer Science
>     Masaryk University, Brno, Czech Republic
>     Email: krejcir (at) ics (dot) muni (dot) cz
>     Phone: +420 549 49 3872
>     OpenPGP key: https://kic-internal.ics.muni.cz/~krejvl/pgp/
>     Fingerprint: 7800 64B2 6E20 645B 56AF  C303 34CB 1495 C641 11B9
>     ----------------------------------------------------------------------------
> 
>     -- 
>     All messages to this mailing list should adhere to the DuraSpace
>     Code of Conduct: https://duraspace.org/about/policies/code-of-conduct/
>     ---
>     You received this message because you are subscribed to the Google
>     Groups "DSpace Community" group.
>     To unsubscribe from this group and stop receiving emails from it,
>     send an email to [email protected]
>     <mailto:[email protected]>.
>     To view this discussion on the web visit
>     
> https://groups.google.com/d/msgid/dspace-community/a37b7af1-59eb-4a7e-b302-196cadbed7a0%40googlegroups.com
>     
> <https://groups.google.com/d/msgid/dspace-community/a37b7af1-59eb-4a7e-b302-196cadbed7a0%40googlegroups.com?utm_medium=email&utm_source=footer>.
> 
> 
> 
> -- 
> Terry Brady
> Applications Programmer Analyst
> Georgetown University Library Information Technology
> https://github.com/terrywbrady/info
> 425-298-5498 (Seattle, WA)
> 

