[
https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-4325:
-----------------------------
Fix Version/s: (was: 1.3.0)
> Improve spark-ec2 cluster launch times
> --------------------------------------
>
> Key: SPARK-4325
> URL: https://issues.apache.org/jira/browse/SPARK-4325
> Project: Spark
> Issue Type: Umbrella
> Components: EC2
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
>
> This is an umbrella task to capture several pieces of work related to
> significantly improving spark-ec2 cluster launch times.
> There are several optimizations we know we can make to
> [{{setup.sh}}|https://github.com/mesos/spark-ec2/blob/v4/setup.sh]
> to make cluster launches faster.
> There are also some improvements to the AMIs that will help a lot.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This
> will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel
> downloads on all nodes of the cluster.
> * Replace instances of
> {code}
> for node in $NODES; do
>   command &   # run in the background so the final wait has jobs to collect
>   sleep 0.3
> done
> wait
> {code}
> with simpler calls to {{pssh}}.
> * Remove the
> [linear backoff|https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665]
> when we wait for SSH availability, since we now already wait for EC2
> status checks to clear before testing SSH.
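
For illustration, the last item (polling for SSH at a constant interval rather than with a linearly growing delay) might look roughly like the sketch below. This is not the code in spark_ec2.py: the function names, the 5-second interval, the attempt limit, and the use of a plain TCP connect to port 22 as a stand-in for a real SSH login are all assumptions made for the example.

```python
import socket
import time


def is_ssh_available(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: an open port 22 is treated as "SSH is up"; a real check
    might run an actual ssh command instead.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ssh(hosts, interval=5.0, max_attempts=60, check=is_ssh_available):
    """Poll all hosts at a fixed interval until every one passes `check`.

    Returns the number of polling rounds used. Raises TimeoutError if some
    hosts are still unreachable after max_attempts rounds.
    """
    pending = set(hosts)
    for attempt in range(1, max_attempts + 1):
        # Re-test only the hosts that have not answered yet.
        pending = {h for h in pending if not check(h)}
        if not pending:
            return attempt
        if attempt < max_attempts:
            # Constant delay: no `interval * attempt` growth between rounds.
            time.sleep(interval)
    raise TimeoutError("SSH still unavailable on: " + ", ".join(sorted(pending)))
```

With a linear backoff the n-th wait lasts n times the base delay, so a node that comes up just after a late check is not noticed until the next, longer wait ends. A fixed interval caps that extra latency at one interval. The `check` parameter is there only so the polling loop can be exercised without real hosts.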