Unfortunately, that script is not under active maintenance. Because Spark's release cycle is accelerating, solutions like this become outdated quickly.

On Fri, Feb 28, 2014 at 7:36 PM, Mayur Rustagi <[email protected]> wrote:

> There is a talk on installing Spark on Amazon (not sure if it's updated for
> 0.9.0).
> http://www.youtube.com/watch?v=G0lSWUqyOhw
> In this case the bootstrap script will run on the new slave when it comes
> up. I am not sure how clean & production quality this is. He seems to be
> leveraging spot instances where this needs to be done properly.
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
> On Fri, Feb 28, 2014 at 10:52 AM, Aureliano Buendia
> <[email protected]> wrote:
>
>> Also, in this talk http://www.youtube.com/watch?v=OhpjgaBVUtU on using
>> Spark Streaming in production, the author seems to have missed the topic of
>> how to manage cloud instances.
>>
>> On Fri, Feb 28, 2014 at 6:48 PM, Aureliano Buendia
>> <[email protected]> wrote:
>>
>>> What's the updated way of deploying Spark Streaming apps on EMR? Using
>>> YARN?
>>>
>>> There are some out-of-date solutions like
>>> https://github.com/ianoc/SparkEMRBootstrap which set up Mesos on EMR. I
>>> wonder if this can be simplified by Spark 0.9.
>>>
>>> Spark-ec2 comes with a considerable amount of configuration, and some
>>> useful utilities like deploy to workers. Porting it to a managed service
>>> such as EMR is not as trivial as it might seem to be.
>>>
>>> On Fri, Feb 28, 2014 at 6:19 PM, Mayur Rustagi
>>> <[email protected]> wrote:
>>>
>>>> I think what you are looking for is sort of a managed service a la EMR
>>>> or Qubole. Spark-ec2 is just software to boot up machines & integrate them
>>>> together using Whirr.
>>>> I agree a managed service for Streaming would be really useful.
>>>> Regards
>>>> Mayur
>>>>
>>>> Mayur Rustagi
>>>> Ph: +1 (760) 203 3257
>>>> http://www.sigmoidanalytics.com
>>>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>>>
>>>> On Fri, Feb 28, 2014 at 8:50 AM, Aureliano Buendia <
>>>> [email protected]> wrote:
>>>>
>>>>> Another subject that was not that important in Spark, but could be
>>>>> crucial for 24/7 Spark Streaming, is reconstruction of lost nodes. By
>>>>> that, I do not mean lost data reconstruction by self healing, but
>>>>> bringing up new ec2 instances once they die for whatever reason. Is
>>>>> this also supported in spark ec2?
>>>>>
>>>>> On Fri, Feb 28, 2014 at 2:24 AM, Tathagata Das <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Yes, the default spark EC2 cluster runs the standalone deploy mode.
>>>>>> Since Spark 0.9, the standalone deploy mode allows you to launch the
>>>>>> driver app within the cluster itself and automatically restart it if
>>>>>> it fails. You can read about launching your app inside the cluster
>>>>>> here <http://spark.incubator.apache.org/docs/latest/spark-standalone.html#connecting-an-application-to-the-cluster>.
>>>>>> Using this you can launch your streaming app as well.
>>>>>>
>>>>>> TD
>>>>>>
>>>>>> On Thu, Feb 27, 2014 at 5:35 PM, Aureliano Buendia <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> How about the Spark Streaming app itself? Does the ec2 script also
>>>>>>> provide means for daemonizing and monitoring Spark Streaming apps
>>>>>>> which are supposed to run 24/7? If not, any suggestions for how to do
>>>>>>> this?
>>>>>>>
>>>>>>> On Thu, Feb 27, 2014 at 8:23 PM, Tathagata Das <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Zookeeper is automatically set up in the cluster as Spark uses
>>>>>>>> Zookeeper. However, you have to set up your own input source like
>>>>>>>> Kafka or Flume.
>>>>>>>>
>>>>>>>> TD
>>>>>>>>
>>>>>>>> On Thu, Feb 27, 2014 at 10:32 AM, Aureliano Buendia <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Feb 27, 2014 at 6:17 PM, Tathagata Das <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Yes! Spark streaming programs are just like any spark program and
>>>>>>>>>> so any ec2 cluster setup using the spark-ec2 scripts can be used
>>>>>>>>>> to run spark streaming programs as well.
>>>>>>>>>
>>>>>>>>> Great. Does it come with any input source support as well? (Eg
>>>>>>>>> kafka requires setting up zookeeper).
>>>>>>>>>
>>>>>>>>>> On Thu, Feb 27, 2014 at 10:11 AM, Aureliano Buendia <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Does the ec2 support for spark 0.9 also include spark streaming?
>>>>>>>>>>> If not, is there an equivalent?
