If I recall correctly, there is no additional premium for using EMR unless you 
use one of the MapR distributions they offer or one of the other value-added options.

So a vanilla EMR cluster on spot instances should cost no more than 
using spark-ec2.
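
To make the cost question concrete, here is a rough sketch of the arithmetic. It assumes a flat per-instance-hour service premium added on top of the spot price. The function names and every number below are made up for illustration and are not actual AWS prices. Whether the premium is zero (my recollection) or not (your concern about prices doubling or tripling) is exactly the input to check against the current pricing page.

```python
def cluster_hourly_cost(n_instances, spot_price, service_premium=0.0):
    """Hourly cluster cost in dollars.

    Each instance pays its spot price plus a flat per-instance-hour
    service premium (0.0 means no premium at all).
    """
    return n_instances * (spot_price + service_premium)


def cost_ratio(spot_price, service_premium):
    """How many times more expensive an instance is once the premium is added."""
    return (spot_price + service_premium) / spot_price


# Hypothetical figures only, NOT real AWS prices.
spot = 0.05      # assumed spot price, dollars per instance-hour
premium = 0.05   # assumed service premium, dollars per instance-hour

plain = cluster_hourly_cost(10, spot)             # no premium (spark-ec2 style)
managed = cluster_hourly_cost(10, spot, premium)  # with the assumed premium

print("plain: %.2f  managed: %.2f  ratio: %.1fx"
      % (plain, managed, cost_ratio(spot, premium)))
```

The point of the ratio is that a fixed premium matters more the cheaper the spot price is: if the premium is about the same size as the spot price the bill doubles, and if it is zero the two setups cost the same.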


> On 28 Jan 2016, at 01:34, Sung Hwan Chung <coded...@cs.stanford.edu> wrote:
> 
> Hm thanks,
> 
> I think what you are suggesting sounds like a recommendation for AWS EMR. 
> However, my questions were wrt spark-ec2. For our uses involving 
> spot-instances, EMR could potentially double/triple prices due to the 
> additional premiums.
> 
> Thanks anyway!
> 
>> On Wed, Jan 27, 2016 at 2:12 PM, Alexander Pivovarov <apivova...@gmail.com> 
>> wrote:
>> You can use EMR-4.3.0 running on spot instances to control the price.
>> 
>> Yes, you can add/remove instances to the cluster on the fly (CORE instances 
>> support add only; TASK instances support add and remove).
>> 
>> 
>> 
>>> On Wed, Jan 27, 2016 at 2:07 PM, Sung Hwan Chung <coded...@cs.stanford.edu> 
>>> wrote:
>>> I noticed that in the main branch, the ec2 directory along with the 
>>> spark-ec2 script is no longer present.
>>> 
>>> Is spark-ec2 going away in the next release? If so, what would be the best 
>>> alternative at that time?
>>> 
>>> A few additional questions:
>>> 1. Is there any way to add/remove workers while the cluster is 
>>> running without stopping/starting the EC2 cluster?
>>> 2. For 1, if no such capability is provided with the current script, do we 
>>> have to write it ourselves? Or is there any plan to add such 
>>> functions in the future?
>>> 3. In PySpark, is it possible to dynamically change driver/executor memory 
>>> or the number of cores per executor without having to restart it? (e.g. via 
>>> changing the sc configuration or recreating sc?)
>>> 
>>> Our ideal scenario is to keep running PySpark (in our case, as a notebook) 
>>> and connect/disconnect to any spark clusters on demand.
> 
