The requirements are a little vague. But then, "entry level system" and "mid-sized university" are a bit fuzzy too. :-) So is a term I'll probably throw in: "server-class box".
It's really hard to pin down precise requirements. A small but very popular collection might need more machine than a huge collection with a very select clientele. I think that the best you are going to get is some examples.

Here we run two production DSpace hosts. One has a single DSpace instance and its DBMS, in 2GB of memory on dual Xeon 2.4GHz processors (hyperthreaded). It contains 4570 ORIGINAL bitstreams (primary documents, not thumbnails or extracted text or licenses) in 1717 items. The DBMS occupies 1.7GB and the assetstore 5.4GB. (The DBMS is also providing two other databases in the same tablespace, so it's hard to say precisely how much is used by ScholarWorks.IUPUI.Edu.) This one is our institutional repository and contains mainly local research output. The memory is comfortably full and performance is unremarkable. We have two gigabit Ethernet links from this host to The World.

The other host runs three DSpace instances and their DBMS. It has 3GB of memory and dual Xeon 3GHz processors (hyperthreaded). The DBMS occupies 6.5GB and the three assetstores about 18GB. The largest instance contains (from memory) about 20,000 documents and has a sizable international audience; the other two are considerably smaller. One instance is our university archive (meeting minutes and such) and would be of limited interest outside the organization. We've had some performance problems, mainly due to memory pressure and my inexperience in tuning a system for sizable Java apps. If I were sizing this system today I would recommend at least 4GB of memory. I'm also considering consolidating the databases on a single, separate host. This host also runs two GbE links to the outside.

I'm also happily running more than a dozen test instances on a 4GB dual Opteron box.

All of these are using hardware RAID-5 storage, via whatever HP StorageWorks or Dell PERC controller came with the system.
I don't think I've got the DBMS tuned well enough yet to say whether there are significant performance limits from that setup, but a lot of Postgres folk prefer software RAID and definitely like other controllers for high-performance DBMS servers. I don't expect to see such limits hit unless our traffic goes up considerably.

--
Mark H. Wood, Lead System Programmer [email protected]
Friends don't let friends publish revisable-form documents.
_______________________________________________ Dspace-general mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/dspace-general
