[ https://issues.apache.org/jira/browse/SPARK-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicholas Chammas updated SPARK-5473:
------------------------------------
    Description:
If there is some fatal problem with launching a cluster, `spark-ec2` just hangs without giving the user useful feedback on what the problem is.

This PR exposes the output of the SSH calls to the user if the SSH test fails during cluster launch for any reason but the instance status checks are all green.

For example:

```
$ ./ec2/spark-ec2 -k key -i /incorrect/path/identity.pem --instance-type m3.medium --slaves 1 --zone us-east-1c launch "spark-test"
Setting up security groups...
Searching for existing cluster spark-test...
Spark AMI: ami-35b1885c
Launching instances...
Launched 1 slaves in us-east-1c, regid = r-7dadd096
Launched master in us-east-1c, regid = r-fcadd017
Waiting for cluster to enter 'ssh-ready' state...
Warning: SSH connection error. (This could be temporary.)
Host: 127.0.0.1
SSH return code: 255
SSH output: Warning: Identity file /incorrect/path/identity.pem not accessible: No such file or directory.
Warning: Permanently added '127.0.0.1' (RSA) to the list of known hosts.
Permission denied (publickey).
```

This should give users enough information when some unrecoverable error occurs during launch so they can know to abort the launch. This will help avoid situations like the ones reported [here on Stack Overflow](http://stackoverflow.com/q/28002443/) and [here on the user list](http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3c1422323829398-21381.p...@n3.nabble.com%3E), where the users couldn't tell what the problem was because it was being hidden by `spark-ec2`.

This is a usability improvement that should be backported to 1.2.
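To make the idea concrete, here is a minimal Python sketch of surfacing a failed SSH test to the user instead of discarding its output. It is not the code from the PR: the helper names `format_ssh_failure` and `run_ssh_check` are hypothetical, and the warning layout simply mirrors the example output above.

```python
import subprocess
import sys


def format_ssh_failure(host, returncode, output):
    """Build a warning message shaped like the example output (hypothetical helper)."""
    return (
        "Warning: SSH connection error. (This could be temporary.)\n"
        "Host: {h}\n"
        "SSH return code: {r}\n"
        "SSH output: {o}"
    ).format(h=host, r=returncode, o=output.strip())


def run_ssh_check(command, host):
    """Run `command` and return True if it exits with status 0.

    On failure, print the return code and the captured output so the
    user can see why the check failed (hypothetical helper).
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # ssh writes its diagnostics to stderr
        universal_newlines=True,
    )
    if result.returncode != 0:
        print(format_ssh_failure(host, result.returncode, result.stdout),
              file=sys.stderr)
        return False
    return True
```

A caller could pass something like `["ssh", "-i", identity_file, "-o", "ConnectTimeout=3", "user@" + host, "true"]` as `command` (the user name and timeout here are assumptions) and retry while the function returns False, so a persistent error such as a bad identity file becomes visible on every attempt.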
> Expose SSH failures after status checks pass
> --------------------------------------------
>
>                 Key: SPARK-5473
>                 URL: https://issues.apache.org/jira/browse/SPARK-5473
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2
>    Affects Versions: 1.2.0
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>            Priority: Minor
>             Fix For: 1.3.0

--
This message was sent by Atlassian JIRA (v6.3.4#6332)