Thanks! That's very helpful.

On Wed, Jan 27, 2016 at 3:33 PM, Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:

> I noticed that in the main branch, the ec2 directory along with the
> spark-ec2 script is no longer present.
>
> It’s been moved out of the main repo to its own location:
> https://github.com/amplab/spark-ec2/pull/21
>
> Is spark-ec2 going away in the next release? If so, what would be the best
> alternative at that time?
>
> It’s not going away. It’s just being removed from the main Spark repo and
> maintained separately.
>
> There are many alternatives like EMR, which was already mentioned, as well
> as more full-service solutions like Databricks. It depends on what you’re
> looking for.
>
> If you want something as close to spark-ec2 as possible but more actively
> developed, you might be interested in checking out Flintrock
> <https://github.com/nchammas/flintrock>, which I built.
>
> Is there any way to add/remove additional workers while the cluster is
> running without stopping/starting the EC2 cluster?
>
> Not currently possible with spark-ec2 and a bit difficult to add. See:
> https://issues.apache.org/jira/browse/SPARK-2008
>
> For 1, if no such capability is provided with the current script, do we
> have to write it ourselves? Or is there any plan in the future to add such
> functions?
>
> No "official" plans to add this to spark-ec2. It’s up to a contributor to
> step up and implement this feature, basically. Otherwise it won’t happen.
>
> Nick
>
> On Wed, Jan 27, 2016 at 5:13 PM Alexander Pivovarov <apivova...@gmail.com>
> wrote:
>
> You can use EMR-4.3.0 running on spot instances to control the price.
>>
>> Yes, you can add/remove instances to the cluster on the fly (CORE instances
>> support add only; TASK instances support add and remove).
>>
>>
>>
>> On Wed, Jan 27, 2016 at 2:07 PM, Sung Hwan Chung <
>> coded...@cs.stanford.edu> wrote:
>>
>>> I noticed that in the main branch, the ec2 directory along with the
>>> spark-ec2 script is no longer present.
>>>
>>> Is spark-ec2 going away in the next release? If so, what would be the
>>> best alternative at that time?
>>>
>>> A couple more additional questions:
>>> 1. Is there any way to add/remove additional workers while the cluster
>>> is running without stopping/starting the EC2 cluster?
>>> 2. For 1, if no such capability is provided with the current script, do
>>> we have to write it ourselves? Or is there any plan in the future to add
>>> such functions?
>>> 3. In PySpark, is it possible to dynamically change driver/executor
>>> memory, number of cores per executor without having to restart it? (e.g.
>>> via changing sc configuration or recreating sc?)
>>>
>>> Our ideal scenario is to keep running PySpark (in our case, as a
>>> notebook) and connect/disconnect to any spark clusters on demand.
>>>
>>
>
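
The PySpark question in the original message (changing executor memory or cores per executor from a running notebook) is not answered in the replies above. As far as I know, executor settings can only be changed by stopping the SparkContext and creating a new one, and in client mode driver memory cannot be changed this way at all because the driver JVM is already running. Below is a minimal sketch of the stop-and-recreate approach. The helper names, the app name, and the master URL are hypothetical, chosen only for illustration.

```python
def executor_settings(master_url, executor_memory, executor_cores):
    """Build the settings for a new SparkContext (hypothetical helper)."""
    return {
        "spark.master": master_url,
        "spark.executor.memory": executor_memory,
        "spark.executor.cores": str(executor_cores),
    }


def reconnect(old_sc, settings, app_name="notebook"):
    """Stop old_sc (if any) and return a new SparkContext built from settings."""
    # Imported here so executor_settings() stays usable without PySpark installed.
    from pyspark import SparkConf, SparkContext

    if old_sc is not None:
        old_sc.stop()  # the old context must be stopped before creating a new one
    conf = SparkConf().setAppName(app_name).setAll(list(settings.items()))
    return SparkContext(conf=conf)


# Example use from a notebook (the master URL is an assumption):
# sc = reconnect(sc, executor_settings("spark://master-host:7077", "4g", 2))
```

Anything cached on the old context (RDDs, broadcast variables) is lost on reconnect, so this suits switching clusters between jobs rather than tuning a job in flight.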
