On 26 Jan 2010, at 19:37, Paul Van Allsburg wrote:
> Ashley Pittman wrote:
>> On 25 Jan 2010, at 15:28, Jonathan Aquilina wrote:
>>> has anyone tried clustering using xen based vm's. what is everyones take on 
>>> that? its something that popped into my head while in my lectures today.
>>>    
>> 
>> I've been using Amazon ec2 for clustering for months now, from a software 
>> perspective it's very similar to running real hardware.  For my needs 
>> (development) it's perfectly adequate, I've not benchmarked it against 
>> running the same code on the raw hardware though.
> 
> I'd love to try clustering on Amazon.

It's really easy.

> Is there a good writeup somewhere on how to configure & use mpi in the cloud?


I'm not sure one is needed.  As a bit of background, I develop and support an 
open source debugging tool for parallel applications (see my sig for details).  
As such I run a lot of parallel apps, but I run them purely to have something to 
test padb against, hence I'm not bothered about performance, I just need a 
running job to interrogate.  What is important for me (or rather my tool) is 
that it works in different environments so I run with a variety of clustering 
software.

With Amazon I can boot any number of machine "instances" and pay $0.085/h for 
each one, typically I run four at a time but I've run with up to twenty.  Once 
the instances are booted there is no difference between using them and using 
real machines.  I regularly use Slurm, OpenMPI (ORTE and under Slurm), MPICH2 
(mpd, hydra and under Slurm) and I've yet to find any way in which the setup 
differs from running on real metal.  For persistent storage I pay for an 'EBS' 
volume which I attach to one vm and nfs export to the others, which use it as a 
shared /home, each instance also comes with a large scratch partition but I 
typically don't use this at all.  I have a bunch of scripts for populating the 
hosts files and adding user accounts and that's all there is to it.  For the 
EBS volume you simply pick the size you need, create the volume, attach it to a 
vm and then mkfs.ext3 as normal, this volume is persistent and is charged for 
per GB per calendar month rather than per instance hour.
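
To give a flavour of the hosts file step, here is a minimal sketch in Python.  
It is not one of my actual scripts: the function names, the "node0, node1, ..." 
naming scheme and the example addresses are all made up for illustration.  It 
builds /etc/hosts style lines from a list of instance IPs, plus an Open MPI 
style hostfile ("<host> slots=<n>").

```python
# Hypothetical helpers, for illustration only: the node naming scheme and
# the example IP addresses below are assumptions, not real cluster details.

def hosts_entries(private_ips, prefix="node"):
    """Return one /etc/hosts style line ("<ip><TAB><name>") per instance."""
    return [f"{ip}\t{prefix}{i}" for i, ip in enumerate(private_ips)]


def machinefile(private_ips, slots, prefix="node"):
    """Return Open MPI style hostfile lines: "<host> slots=<n>"."""
    return [f"{prefix}{i} slots={slots}" for i in range(len(private_ips))]


if __name__ == "__main__":
    ips = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]  # example only
    print("\n".join(hosts_entries(ips)))
    print("\n".join(machinefile(ips, slots=1)))
```

The output of the first function would be appended to /etc/hosts on every 
instance, the second is what you would hand to mpirun as a hostfile.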

I can also choose what distro and indeed OS to run, the default is FC8 but it's 
easy enough to pick something else, I tend to flip between FC8, debian and 
Solaris every few weeks, this is mostly to ensure my code is well tested in 
different machines - it does mean re-compiling everything each time I switch 
which can take a while.

I also noticed that over-committing virtual machines doesn't have the same 
negative impact as over-committing the CPUs on real machines, sure the 
application performance plummets in either case but the virtual machine is 
still usable whereas a real machine can stop responding almost completely.  
This means I can over-commit my vm's by running 32 procs per node and run 512 
process jobs at a cost of only $1.36 an hour.  Cheap enough to be able to try 
something, see if it works and not have to worry about the cost.
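
The arithmetic behind that figure, as a small sketch.  The rate of $0.085 per 
instance-hour is an assumption here, it is the value that is consistent with 
the $1.36 total for 512 processes at 32 per node; the function name is made up.

```python
import math


def hourly_cost(total_procs, procs_per_node, rate_per_instance_hour):
    """Return (instances needed, total cost per hour) for a job."""
    nodes = math.ceil(total_procs / procs_per_node)
    return nodes, nodes * rate_per_instance_hour


# 512 processes at 32 per node, assumed rate of $0.085 per instance-hour
nodes, cost = hourly_cost(512, 32, 0.085)
print(nodes, f"${cost:.2f}/h")
```

That is 16 instances, so the hourly cost scales with the number of instances 
rather than the number of processes.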

In short, Amazon makes a really good development or test system for small scale 
clusters, it's good for testing code correctness and experimenting with 
different distros.  I'm not convinced about the performance and I'm not 
convinced about the cost effectiveness of larger or longer running applications 
but as a place to start it's ideal.

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk


_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
