Unfortunately, that script is not under active maintenance. Given Spark's
accelerated release cycles, solutions like this get outdated quickly.


On Fri, Feb 28, 2014 at 7:36 PM, Mayur Rustagi <[email protected]> wrote:

> There is a talk on installing Spark on Amazon (not sure if it's updated for
> 0.9.0).
> http://www.youtube.com/watch?v=G0lSWUqyOhw
> In this case the bootstrap script will run on the new slave when it comes
> up. I am not sure how clean & production quality this is. He seems to be
> leveraging spot instances where this needs to be done properly.
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
>
>
> On Fri, Feb 28, 2014 at 10:52 AM, Aureliano Buendia 
> <[email protected]> wrote:
>
>> Also, in this talk http://www.youtube.com/watch?v=OhpjgaBVUtU on using
>> spark streaming in production, the author seems to have missed the topic of
>> how to manage cloud instances.
>>
>>
>> On Fri, Feb 28, 2014 at 6:48 PM, Aureliano Buendia 
>> <[email protected]> wrote:
>>
>>> What's the updated way of deploying spark streaming apps on EMR? Using
>>> YARN?
>>>
>>> There are some out-of-date solutions like
>>> https://github.com/ianoc/SparkEMRBootstrap which set up Mesos on EMR. I
>>> wonder if this can be simplified by Spark 0.9.
>>>
>>> Spark-ec2 comes with a considerable amount of configuration and some
>>> useful utilities like deploy to workers. Porting it to a managed service
>>> such as EMR is not as trivial as it might seem.
>>>
>>>
>>> On Fri, Feb 28, 2014 at 6:19 PM, Mayur Rustagi 
>>> <[email protected]> wrote:
>>>
>>>> I think what you are looking for is sort of a managed service à la EMR
>>>> or Qubole. Spark-ec2 is just software to boot up machines & integrate them
>>>> together using Whirr.
>>>> I agree a managed service for Streaming would be really useful.
>>>> Regards
>>>> Mayur
>>>>
>>>> Mayur Rustagi
>>>> Ph: +1 (760) 203 3257
>>>> http://www.sigmoidanalytics.com
>>>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>>>
>>>>
>>>>
>>>> On Fri, Feb 28, 2014 at 8:50 AM, Aureliano Buendia <
>>>> [email protected]> wrote:
>>>>
>>>>> Another subject that was not that important in Spark, but could be
>>>>> crucial for 24/7 Spark Streaming, is reconstruction of lost nodes. By
>>>>> that, I do not mean lost data reconstruction by self-healing, but
>>>>> bringing up new EC2 instances once they die for whatever reason. Is this
>>>>> also supported in spark-ec2?
>>>>>
>>>>>
>>>>> On Fri, Feb 28, 2014 at 2:24 AM, Tathagata Das <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Yes, the default Spark EC2 cluster runs the standalone deploy mode.
>>>>>> Since Spark 0.9, the standalone deploy mode allows you to launch the
>>>>>> driver app within the cluster itself and automatically restart it if it
>>>>>> fails. You can read about launching your app inside the cluster here:
>>>>>> <http://spark.incubator.apache.org/docs/latest/spark-standalone.html#connecting-an-application-to-the-cluster>.
>>>>>> Using this you can launch your streaming app as well.
>>>>>>
>>>>>> TD
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 27, 2014 at 5:35 PM, Aureliano Buendia <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> How about spark stream app itself? Does the ec2 script also provide
>>>>>>> means for daemonizing and monitoring spark streaming apps which are
>>>>>>> supposed to run 24/7? If not, any suggestions for how to do this?
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 27, 2014 at 8:23 PM, Tathagata Das <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Zookeeper is automatically set up in the cluster as Spark uses
>>>>>>>> Zookeeper. However, you have to set up your own input source like
>>>>>>>> Kafka or Flume.
>>>>>>>>
>>>>>>>> TD
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Feb 27, 2014 at 10:32 AM, Aureliano Buendia <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Feb 27, 2014 at 6:17 PM, Tathagata Das <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Yes! Spark Streaming programs are just like any Spark program, so
>>>>>>>>>> any EC2 cluster set up using the spark-ec2 scripts can be used to
>>>>>>>>>> run Spark Streaming programs as well.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Great. Does it come with any input source support as well? (E.g.,
>>>>>>>>> Kafka requires setting up Zookeeper.)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 27, 2014 at 10:11 AM, Aureliano Buendia <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Does the ec2 support for spark 0.9 also include spark streaming?
>>>>>>>>>>> If not, is there an equivalent?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
