I second this motion. :) A unified "cloud deployment" tool would be absolutely great.
On Mon, May 5, 2014 at 1:34 PM, Matei Zaharia <[email protected]> wrote:

> Very cool! Have you thought about sending this as a pull request? We'd be
> happy to maintain it inside Spark, though it might be interesting to find a
> single Python package that can manage clusters across both EC2 and GCE.
>
> Matei
>
> On May 5, 2014, at 7:18 AM, Akhil Das <[email protected]> wrote:
>
> Hi Sparkers,
>
> We have created a quick spark_gce script which can launch a Spark cluster
> in the Google Cloud. I'm sharing it because it might be helpful for someone
> using the Google Cloud for deployment rather than AWS.
>
> Here's the link to the script:
>
> https://github.com/sigmoidanalytics/spark_gce
>
> Feel free to use it and suggest any feedback.
>
> In short, here's what it does:
>
> Just like the spark_ec2 script, this one reads certain command-line
> arguments (see the GitHub page
> <https://github.com/sigmoidanalytics/spark_gce> for more details), such as
> the cluster name, then starts the machines in the Google Cloud, sets up
> the network, attaches a 500GB empty disk to every machine, generates the
> SSH keys on the master and transfers them to all slaves, installs Java,
> and downloads and configures Spark/Shark/Hadoop. It also starts the Shark
> server automatically. Currently the version is 0.9.1, but I'm happy to
> add/support more versions if anyone is interested.
>
> Cheers.
>
> Thanks
> Best Regards
