[ 
https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354956#comment-14354956
 ] 

Nicholas Chammas commented on SPARK-4325:
-----------------------------------------

At this point it's more of an umbrella task covering any issues that affect 
spark-ec2 cluster launch times. I'm not sure that's appropriate, but I've seen 
other issues structured like this.

I'm fine with closing this issue, but it's what I'm using to group issues 
related to the same problem.

> Improve spark-ec2 cluster launch times
> --------------------------------------
>
>                 Key: SPARK-4325
>                 URL: https://issues.apache.org/jira/browse/SPARK-4325
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>            Priority: Minor
>             Fix For: 1.3.0
>
>
> There are several optimizations we know we can make to [{{setup.sh}} | 
> https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches 
> faster.
> There are also some improvements to the AMIs that will help a lot.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This 
> will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel 
> downloads on all nodes of the cluster.
> * Replace instances of 
>  {code}
> for node in $NODES; do
>   command &
>   sleep 0.3
> done
> wait{code}
>  with simpler calls to {{pssh}}.
> * Remove the [linear backoff | 
> https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665]
>  when we wait for SSH availability now that we are already waiting for EC2 
> status checks to clear before testing SSH.
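
For the last item, here is a rough sketch of what polling at a fixed interval (instead of a linear backoff) might look like. This is not the actual spark_ec2.py code: the helper name {{wait_until_ready}}, its parameters, and the injected {{is_ready}} probe are all hypothetical, and a real version would plug in an actual SSH check.

```python
import time


def wait_until_ready(is_ready, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Poll is_ready() at a fixed interval and return the attempts used.

    Hypothetical helper for illustration only. Raises TimeoutError if
    is_ready() never returns True within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        if is_ready():
            return attempt
        if attempt < max_attempts:
            # Fixed delay: the wait does not grow with the attempt number,
            # unlike a linear backoff such as sleep(interval * attempt).
            sleep(interval)
    raise TimeoutError("not ready after %d attempts" % max_attempts)


if __name__ == "__main__":
    # Stand-in for a real SSH probe: succeeds on the third check.
    results = iter([False, False, True])
    print(wait_until_ready(lambda: next(results), interval=0.01))
```

The idea is that once the EC2 status checks have cleared, SSH should come up soon, so a short constant delay between probes is probably enough. Passing {{sleep}} in as a parameter is only there to make the helper easy to exercise without real waiting.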



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
