Whirr's role in an entire cloud-based architecture deployment

John Conwell Wed, 05 Oct 2011 14:58:45 -0700

Hey guys,
Here are some thoughts I've been kicking around lately about Whirr.


I've been using Whirr fairly extensively since 0.4.0.  At first my needs
were fairly simple, requiring only a single hadoop cluster.  Then
things got a bit more complex and I needed three different clusters (hadoop,
solr, cassandra), so I started using Whirr's API and built a bit of
automation around it.  And now my requirements have gotten fairly complex:
I have 7 different kinds of clusters being created, and 3 times that
many post-launch steps to authorize ingress from one cluster to
another, run custom configuration scripts, copy required files to the
clusters, etc.

And this has brought me to the question: what do you think Whirr's role
should be when it comes to complex, interdependent cloud-based architecture
deployment?  Whirr is really good at creating a single cluster of
non-dependent resources, meaning it's good at creating a cluster of VMs that
don't require any upstream dependencies in order to be used.  And this is
fine as long as there are no external dependencies.  But what about
deployment scenarios where there are N different types of clusters, and
where the configuration of one cluster depends on the makeup of a previous
cluster?  Also, what about other kinds of deployment steps, like configuring
custom firewall rules or executing custom setup scripts?

For example, the scenario that I'm in the process of automating creates the
following clusters: hadoop, cassandra, solr, zookeeper, activemq, haproxy,
and two different tomcat clusters.  Then there are cluster-to-cluster
ingress rules I need to set, as well as a few IP-address-to-cluster rules.
But that's not the worst of it.  In order to fully configure our tomcat
servers, for example, I need to know things like the IP addresses of the
cassandra, hadoop, solr, and activemq nodes.  So I've got custom steps that
gather this info and call runScriptOnNodesMatching on the tomcat cluster.
Then there are external files that need to be put on certain clusters,
like custom solr config and schema files.  These I download from a
blobstore, again triggered from a script executed by
runScriptOnNodesMatching.
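
To make the tomcat step concrete, here is a minimal Python sketch of turning
gathered upstream addresses into a script body that could then be handed to
something like runScriptOnNodesMatching.  It is only an illustration, not
Whirr or jclouds code: the function name, the properties file path, and the
"<cluster>.hosts" key format are all hypothetical.

```python
from typing import Dict, List


def render_tomcat_setup_script(upstream: Dict[str, List[str]]) -> str:
    """Build a shell script that records upstream node addresses.

    `upstream` maps a cluster name (e.g. "cassandra") to the IP addresses
    of its nodes.  The output path and key format are assumptions made
    for this example only.
    """
    lines = [
        "#!/bin/bash",
        "set -e",
        # Quoted heredoc delimiter so the shell does not expand the body.
        "cat > /tmp/upstream.properties <<'EOF'",
    ]
    # Sort cluster names so the generated script is deterministic.
    for cluster in sorted(upstream):
        lines.append(f"{cluster}.hosts={','.join(upstream[cluster])}")
    lines.append("EOF")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_tomcat_setup_script({
        "cassandra": ["10.0.0.1", "10.0.0.2"],
        "solr": ["10.0.2.1"],
    }))
```

The point of the sketch is that the script body cannot be generated until
the upstream clusters exist and their addresses are known.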

So in order to fully support complex cloud-based deployments, there is a set
of actions that need to get stitched together and executed in a specified
order, so that downstream actions can get info about upstream
deployment actions: launch cluster action, remote script action, cluster
ingress action, IP ingress action, file upload action, blob file upload
action, etc., all hopefully driven by one configuration file that can define
the entire set of complex, interdependent deployment actions.
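
As a rough illustration of the ordering problem, the Python sketch below
runs a set of named actions in dependency order and passes each action the
results of the ones before it.  None of this is Whirr's API: the Action
class, execute_plan, the action names, and the IP addresses are
hypothetical, and the lambdas stand in for real launch, ingress, and script
steps.  It uses the standard-library graphlib module (Python 3.9+).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]


@dataclass
class Action:
    """One deployment step; `run` receives the results of earlier steps."""
    name: str
    run: Callable[[Context], Any]
    depends_on: List[str] = field(default_factory=list)


def execute_plan(actions: List[Action]) -> Tuple[List[str], Context]:
    """Run actions so that every dependency finishes before its dependents."""
    by_name = {a.name: a for a in actions}
    for a in actions:
        for dep in a.depends_on:
            if dep not in by_name:
                raise ValueError(f"{a.name} depends on unknown action {dep}")
    # Map each action to its predecessors; a cycle raises graphlib.CycleError.
    graph = {a.name: set(a.depends_on) for a in actions}
    context: Context = {}
    order: List[str] = []
    for name in TopologicalSorter(graph).static_order():
        context[name] = by_name[name].run(context)
        order.append(name)
    return order, context


# Hypothetical plan; it could just as well be loaded from one config file.
plan = [
    Action("configure-tomcat",
           lambda ctx: "cassandra=" + ",".join(ctx["launch-cassandra"]),
           ["launch-cassandra", "launch-tomcat", "allow-tomcat-to-cassandra"]),
    Action("launch-cassandra", lambda ctx: ["10.0.0.1", "10.0.0.2"]),
    Action("launch-tomcat", lambda ctx: ["10.0.1.1"]),
    Action("allow-tomcat-to-cassandra", lambda ctx: "ingress opened",
           ["launch-cassandra", "launch-tomcat"]),
]

order, context = execute_plan(plan)
```

Here the tomcat configuration step is declared first but runs last, and it
reads the cassandra addresses from the shared context rather than having
them hard-coded.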

Thoughts?

-- 

Thanks,
John C
