[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203757#comment-14203757 ]

Shivaram Venkataraman commented on SPARK-3821:
----------------------------------------------

[~nchammas] Thanks for putting this together -- this is looking great! I just 
had a couple of quick questions and clarifications.

1. My preference would be to have a single AMI across Spark versions, for a 
couple of reasons. First, it reduces the steps needed for every release (even 
though creating AMIs is definitely much simpler now!). Also, the number of AMIs 
we maintain could get large if we do this for every major and minor release 
like 1.1.1. [~pwendell] could probably comment more on the release process.

2. Could you clarify whether Hadoop is pre-installed in the new AMIs, or is it 
still installed on startup? The flexibility we currently have of switching 
between Hadoop 1, Hadoop 2, YARN, etc. is useful for testing. (Related Packer 
question: are the [init scripts|https://github.com/nchammas/spark-ec2/blob/packer/packer/spark-packer.json#L129]
run during AMI creation or during startup?)
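
To make the build-time vs. startup-time distinction concrete, here is a 
minimal sketch of a Packer template (hypothetical script names and AMI IDs; 
the actual spark-packer.json may differ). Scripts listed under `provisioners` 
run once while the AMI is being baked, so whatever they install is frozen into 
the image; anything that must react to per-cluster configuration (e.g. 
choosing Hadoop 1 vs. Hadoop 2) has to stay in the startup path instead.

```json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-east-1",
      "source_ami": "ami-xxxxxxxx",
      "instance_type": "m3.medium",
      "ssh_username": "ec2-user",
      "ami_name": "spark-base-{{timestamp}}"
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "scripts": [
        "scripts/update-os.sh",
        "scripts/install-common-deps.sh"
      ]
    }
  ]
}
```

Anything not listed in `provisioners` (such as Hadoop/YARN selection) would 
still have to run at instance startup via the spark-ec2 setup scripts.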

3. Do you have any benchmarks for the new AMI without Spark 1.1.0 
pre-installed? [We currently have old AMI vs. new AMI with 
Spark|https://github.com/nchammas/spark-ec2/blob/packer/packer/proposal.md#new-amis---latest-os-updates-and-spark-110-pre-installed-single-run].
I see a couple of huge wins in the new AMI (from the SSH wait time, Ganglia 
init, etc.), which I would guess we should get even without Spark being 
pre-installed.

> Develop an automated way of creating Spark images (AMI, Docker, and others)
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-3821
>                 URL: https://issues.apache.org/jira/browse/SPARK-3821
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build, EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>         Attachments: packer-proposal.html
>
>
> Right now the creation of Spark AMIs or Docker containers is done manually. 
> With tools like [Packer|http://www.packer.io/], we should be able to automate 
> this work, and do so in such a way that multiple types of machine images can 
> be created from a single template.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
