On Thu, Nov 19, 2009 at 5:45 PM, Dan Yamins <[email protected]> wrote:
> Hi all:
>
> I'm just writing to report on my experience using Starcluster, which enables the use of NumPy and SciPy in the Amazon EC2 cloud computing environment. The purpose of my email is to extol Starcluster's qualities, and suggest that the NumPy community be aware of its development. I suspect there are others in the community who find cloud computing an attractive idea but a little daunting to get into,

Thanks, Dan, this is me (for one), and I appreciate you making the time and effort to do this. If enough of us dive into this, perhaps we could/should start a numpig...

DG

> and would be pleasantly surprised at how easy Starcluster makes it to get started using NumPy on Amazon EC2.
>
> For those of you who aren't familiar with AMIs and the Amazon EC2 service, see e.g. http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud. Three of the basic concepts are "Amazon Machine Images" (AMIs), "machine instances" of AMIs, and the Elastic Block Store (EBS) service. AMIs are disk images containing a virtual machine, including an operating system and other software you add on. Instances are temporarily allocated computers, booted with your chosen virtual machine, that you start up on demand, use for computations with software from the AMI, and then terminate. EBS is a persistent storage service, also from Amazon, that serves as a permanent file system in the cloud. You allocate an EBS volume of a given size, attach the EBS volume(s) to a running machine instance just like any other hard drive, and use it to store the files you use or create, both during the computation and for later use whenever you start up a new instance.
>
> A couple of weeks ago I wrote to this list asking for advice on finding a good Amazon Machine Image (AMI) for using NumPy and SciPy on the Amazon cloud.
> I didn't want to have to build a Linux machine image with optimized BLAS and LAPACK myself, and I figured that there might be good existing publicly available AMIs that I could use as a base. Robert Kern suggested that I look into the Starcluster project (http://web.mit.edu/stardev/cluster/).
>
> I have found Starcluster extremely useful. It made it possible for me, in the course of one day, to go from knowing essentially nothing about cloud computing to being able to run large-scale parallel clusters with my favorite NumPy/SciPy scripts.
>
> The basis of what Starcluster offers is two solidly built AMIs. The operating system is Ubuntu Jaunty, and it comes with prebuilt optimized BLAS and LAPACK, NumPy, SciPy, matplotlib, IPython, and several other useful packages for scientific computing in Python. It uses Python 2.6, and comes in both 32-bit and 64-bit flavors. The AMIs are based on AMIs from Alestic (http://alestic.com/), and are built with best practices for ensuring stability and good interaction with Amazon's system. They have proved very stable and extensible.
>
> In addition to these AMIs, Starcluster has three extremely useful features:
>
>   -- Built-in support for mounting EBS drives as NFS filesystems, and then administering the shared drive across multiple machine instances.
>   -- The Sun Grid Engine (SGE), a queuing system for scheduling jobs to be run in parallel across instances.
>   -- A Python module with a few commands that give you an incredibly simple interface for automating the process of starting/terminating a cluster of instances, mounting the shared drive, starting the grid engine, &c -- and configuring your cluster needs (e.g. how many nodes it will contain, which AMIs to use, which EBS volumes to mount, etc.).
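
To make the combination of the shared NFS drive and SGE a little more concrete, here is a minimal sketch of a script that could run as one task of an SGE array job. It is not taken from the Starcluster documentation. It assumes only that SGE sets the SGE_TASK_ID and SGE_TASK_LAST environment variables for array jobs; the function names and the output directory are hypothetical, and the sum of squares is a stand-in for whatever NumPy work each task would really do.

```python
import os


def task_slice(n_items, n_tasks, task_id):
    """Return the (start, stop) index range handled by 1-based task `task_id`."""
    if not 1 <= task_id <= n_tasks:
        raise ValueError("task_id must be between 1 and n_tasks")
    base, extra = divmod(n_items, n_tasks)
    # The first `extra` tasks each take one additional item.
    start = (task_id - 1) * base + min(task_id - 1, extra)
    stop = start + base + (1 if task_id <= extra else 0)
    return start, stop


def env_int(name, default):
    """Read an integer environment variable, falling back to `default`.

    Outside an array job the variable may be missing or non-numeric.
    """
    try:
        return int(os.environ.get(name, ""))
    except ValueError:
        return default


def run_task(n_items, out_dir):
    """Process this task's share of the work and write the partial result.

    `out_dir` is assumed to sit on the shared NFS drive so that every
    node can see every task's output (hypothetical layout).
    """
    task_id = env_int("SGE_TASK_ID", 1)
    n_tasks = env_int("SGE_TASK_LAST", 1)
    start, stop = task_slice(n_items, n_tasks, task_id)
    # Placeholder computation: replace with the real per-chunk work.
    partial = sum(i * i for i in range(start, stop))
    path = os.path.join(out_dir, "partial_%d.txt" % task_id)
    with open(path, "w") as f:
        f.write("%d\n" % partial)
    return path, partial
```

Submitted as an array job (qsub's -t option, e.g. -t 1-8), each task would pick up its own slice and write its own file to the shared directory, so a final step only has to read the partial files and combine them.
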
>
> As a result, all you have to do to have a NumPy-enabled cluster-on-demand is:
>   1) Get an Amazon EC2 account, and the accompanying security credentials (X.509 certificates and SSH keypair) for your account.
>   2) Install starcluster ("easy_install starcluster").
>   3) Follow the installation procedure on the starcluster website for getting, attaching, and formatting an EBS volume as an NFS drive.
>   4) Set up your starcluster configuration file.
>   5) Start a 1-node cluster, modify the installation as you see fit, and re-bundle the result into a new AMI as described on the Amazon website http://docs.amazonwebservices.com/AWSEC2/latest/GettingStartedGuide/. (Don't forget to edit your starcluster configuration file to reflect your new AMI.) This step is optional -- if you don't need anything else special, you can just use Starcluster's base images.
>
> After that, starting a cluster is as easy as typing a single command ("starcluster -s"). To submit parallel jobs on your cluster, you can learn to use the Sun Grid Engine "qsub" command (http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman1/qsub.html) or use the Python bindings to the SGE interface (http://code.google.com/p/drmaa-python/). Or, if you like Parallel Python, that works perfectly well on these clusters too.
>
> Overall, in my experience, Starcluster has been easy, stable and powerful, and I encourage anyone who is curious about cloud computing with NumPy to look into it.
>
> Starcluster is by no means a finished project.
> At the moment, you can only administer one cluster at a time from a given local machine, since starcluster has no notion of a "session" and can't distinguish between different clusters you've started up. (You can *start* multiple clusters, but then any starcluster commands that you type in your local terminal might get confused about which Amazon machine instances you're referring to, so it has trouble administering them.) Also, there's no dynamic load balancing, so once you've started a cluster with a certain number of nodes, you're stuck with that number of computers while the cluster is running, even if you're only using a few of them or suddenly need more.
>
> The developer of the project (Justin Riley) says on his website that he's planning to add these features in the next release. Now, I'm not the creator or developer or maintainer of Starcluster, and I have no affiliation with Justin Riley or the project whatsoever, so I want to make it clear I don't speak for them in any way except as a satisfied user. I don't know what his commitment to his development plans is, either -- however, I hope he sticks to his timeline, as I think continuing the vigorous development of his project would be a real plus for the NumPy community. I'm hoping that if others in the NumPy community like his project and start using it, that will add to the likelihood of continued development. (If anyone from the NumPy community is interested in helping the developer out, perhaps you should consider shooting him an email.)
>
> Anyhow, I apologize for this long email, and hope it may be of use to somebody!
>
> Dan
>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
