Very cool! Have you thought about sending this as a pull request? We’d be happy to maintain it inside Spark, though it might be interesting to find a single Python package that can manage clusters across both EC2 and GCE.
Matei

On May 5, 2014, at 7:18 AM, Akhil Das <[email protected]> wrote:

> Hi Sparkers,
>
> We have created a quick spark_gce script which can launch a Spark cluster in
> the Google Cloud. I'm sharing it because it might be helpful for someone
> using the Google Cloud for deployment rather than AWS.
>
> Here's the link to the script:
>
> https://github.com/sigmoidanalytics/spark_gce
>
> Feel free to use it and send any feedback on it.
>
> In short, here's what it does:
>
> Just like the spark_ec2 script, this one reads certain command-line
> arguments (see the GitHub page for more details), such as the cluster name,
> then starts the machines in the Google Cloud, sets up the network, adds
> a 500GB empty disk to all machines, generates the SSH keys on the master and
> transfers them to all slaves, installs Java, and downloads and configures
> Spark/Shark/Hadoop. It also starts the Shark server automatically. Currently
> the version is 0.9.1, but I'm happy to add/support more versions if anyone is
> interested.
>
>
> Cheers.
>
>
> Thanks
> Best Regards
