Hi, I really, really appreciate all the ideas that folks have put forward on both email lists. They have helped me better define what the customer needs. This is a development and test cluster used for testing new algorithms and benchmarks for parallel R.
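For context, here is a minimal sketch of the kind of multi-node, multi-core parallel R job the cluster needs to support, using the base "parallel" package. The hostnames are hypothetical placeholders, and a PSOCK cluster like this assumes R is installed on every node and that the controller can reach each node over passwordless SSH:

    ## One PSOCK worker per core: e.g. 5 nodes x 4 cores (hostnames are placeholders)
    library(parallel)
    hosts <- rep(c("node1", "node2", "node3", "node4", "node5"), each = 4)
    cl <- makePSOCKcluster(hosts)

    ## Split the work into chunks and spread them across all 20 workers
    chunks <- split(seq_len(4000), rep(seq_along(hosts), length.out = 4000))
    results <- parLapply(cl, chunks, function(idx) sum(sqrt(idx)))

    stopCluster(cl)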
The cluster will run custom statistics software that analyzes very large data sets by spreading the work across nodes and cores, so it needs multiple nodes in addition to multiple cores per node. The application is expected to be I/O bound, as it will be moving files of up to 400 GB between the nodes and between the nodes and permanent storage. The data on the nodes is "semi-permanent", which means I need to mirror the disks so that a disk failure will not result in data loss.

So I am looking at:

- 5 nodes, each with one quad-core Nehalem CPU, 8-16 GB of RAM, and either mirrored 6 Gb/s SATA disks or multiple SAS/SCSI disks in a mirrored, striped setup.
- Two switches: one for data to the SAN and the controller, and one for the interconnects between nodes.
- RHEL/Rocks and Windows HPC as the operating systems, set up as dual boot on each node. This means the developers will need to reconfigure the nodes by hand, which is OK.
- A single controller machine running ESXi with two virtual machines, one Windows HPC and one RHEL/Rocks, so that both controllers can run all the time.
- Some sort of SAN or NAS to provide shared space.

I am not looking at redundant power supplies, since this is a dev/test cluster. I do plan on checking into Silicon Mechanics and Dell; it looks like separate servers are a cheaper way to go than a blade system.

Thanks again for everyone's feedback.

cheers,

ski

--
"When we try to pick out anything by itself, we find it connected to the entire universe"  John Muir

Chris "Ski" Kacoroski, [email protected], 206-501-9803 or ski98033 on most IM services
