[
https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203786#comment-14203786
]
Nicholas Chammas commented on SPARK-3821:
-----------------------------------------
Thanks for the feedback [~shivaram].
{quote}
1. My preference would be to just have a single AMI across Spark versions for a
couple of reasons.
{quote}
I agree. Maintaining images for specific versions of Spark is worth it only if
you're really crazy about getting the lowest cluster launch times possible.
Well, that was my [original motivation|http://apache-spark-developers-list.1001551.n3.nabble.com/EC2-clusters-ready-in-launch-time-30-seconds-td7262.html]
for doing this work, but ultimately I agree the complexity is not worth it at
the moment. I'll take this out unless someone wants to advocate for leaving it
in.
{quote}
2. Could you clarify whether Hadoop is pre-installed in the new AMIs or is still
installed on startup?
{quote}
Currently, I have it set to install Hadoop 2 on the AMIs with Spark
pre-installed. Again, this was done with the intention of aiming for the lowest
launch time possible, but if we'd like to do away with the Spark-pre-installed
AMIs then this is not an issue.
{quote}
Are the init scripts run during AMI creation or during startup?
{quote}
For the AMIs with Spark pre-installed, they are run during AMI creation. That's
why the [init runtimes in the second benchmark|https://github.com/nchammas/spark-ec2/blob/214d5e4cac392a0eac21f949fe25c0075044411f/packer/proposal.md#new-amis---latest-os-updates-and-spark-110-pre-installed-single-run]
are all 0 ms: each init script sees that its component is already installed and
exits immediately.
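To illustrate the skip-if-installed behavior, here is a minimal sketch in Python. It is not the actual spark-ec2 init code; {{run_init}}, {{fake_install}}, and the directory layout are hypothetical names made up for this example. The only assumption is that an init step detects an existing install directory and returns without doing any work.

```python
import os
import tempfile
import time


def run_init(component, install_dir, install):
    """Install `component` unless `install_dir` already exists.

    Returns the elapsed time in whole milliseconds.
    """
    start = time.monotonic()
    if os.path.isdir(install_dir):
        # Already baked into the image: nothing to do at launch time.
        print(f"{component} seems to be installed. Exiting.")
    else:
        install(install_dir)
    return int((time.monotonic() - start) * 1000)


def fake_install(install_dir):
    # Hypothetical stand-in for the real download-and-unpack work.
    os.makedirs(install_dir)
    time.sleep(0.05)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        spark_dir = os.path.join(root, "spark")
        # First call plays the role of AMI creation, second the cluster launch.
        print("first run (ms):", run_init("spark", spark_dir, fake_install))
        print("second run (ms):", run_init("spark", spark_dir, fake_install))
```

The first call does the install work; the second finds the directory and returns almost immediately, which is the effect that shows up as near-zero init runtimes when everything is pre-installed on the image.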
{quote}
3. Do you have some benchmarks for the new AMI without Spark 1.1.0
pre-installed?
{quote}
Nope, but I can run one and get back to you on Monday or Tuesday with those
numbers.
> Develop an automated way of creating Spark images (AMI, Docker, and others)
> ---------------------------------------------------------------------------
>
> Key: SPARK-3821
> URL: https://issues.apache.org/jira/browse/SPARK-3821
> Project: Spark
> Issue Type: Improvement
> Components: Build, EC2
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Attachments: packer-proposal.html
>
>
> Right now the creation of Spark AMIs or Docker containers is done manually.
> With tools like [Packer|http://www.packer.io/], we should be able to automate
> this work, and do so in such a way that multiple types of machine images can
> be created from a single template.